ODB 2.1.0 released

Tuesday, September 18th, 2012

ODB 2.1.0 was released today.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, and without manually writing any of the mapping code. ODB natively supports SQLite, PostgreSQL, MySQL, Oracle, and Microsoft SQL Server. Pre-built packages are available for GNU/Linux, Windows, Mac OS X, and Solaris. Supported C++ compilers include GCC, MS Visual C++, Sun CC, and Clang.

This release packs a long list of new features (note to ourselves: if the NEWS entries are over a page long, it is time to release). The major ones include the ability to use accessor and modifier functions/expressions to access data members, the ability to declare virtual data members, the ability to define database indexes, as well as support for mapping extended database types, such as geospatial types, user-defined types, and collections. There are also notable additions to the profile libraries. The Boost profile now includes support for the Multi-Index and Uuid libraries while the Qt profile now supports the QUuid type. Furthermore, there are a number of improvements in individual database support, especially for SQLite (see below).

We have also added Visual Studio 2012 and Clang 3.1 to the list of compilers that we use for testing. Specifically, all the runtime libraries, examples, and tests now come with project/solution files for Visual Studio 2012 in addition to 2010 and 2008. As always, below I am going to examine these and other notable new features in more detail. For the complete list of changes, see the official ODB 2.1.0 announcement.

Accessors and Modifiers

ODB now supports multiple ways to access data members in persistent objects, views, and value types. Now, if the data member is not accessible directly, the ODB compiler will try to automatically discover suitable accessor and modifier functions. By default the ODB compiler will look for names in the form: get/set_foo(), get/setFoo(), get/setfoo(), as well as just foo(). You can also add custom name derivations with the --accessor-regex and --modifier-regex options. Here is an example:

 
#pragma db object
class person
{
public:
  const std::string& name () const;
  void setName (const std::string&);
 
private:
  std::string name_; // Using name() and setName().
 
  ...
};
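
To make the naming scheme concrete, here is a minimal sketch that lists the accessor and modifier names one would expect to be tried for a given data member. It is only an illustration of the forms listed above, not the ODB compiler's actual implementation; the function names and the decoration-stripping rule (a leading m_ or a trailing underscore) are my assumptions.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Strip common member decorations (a leading "m_" or a trailing '_').
// This is a simplification for illustration; the real rules may differ.
std::string public_name (const std::string& member)
{
  assert (!member.empty ());

  std::string n (member);
  if (n.size () > 2 && n.compare (0, 2, "m_") == 0)
    n.erase (0, 2);
  if (n.size () > 1 && n.back () == '_')
    n.pop_back ();
  return n;
}

// Build the four candidate names for a prefix ("get" or "set"):
// prefix_foo, prefixFoo, prefixfoo, and just foo.
std::vector<std::string> candidates (const std::string& prefix,
                                     const std::string& member)
{
  const std::string n (public_name (member));

  std::string cap (n);
  cap[0] = static_cast<char> (
    std::toupper (static_cast<unsigned char> (cap[0])));

  return {prefix + "_" + n, prefix + cap, prefix + n, n};
}

std::vector<std::string> accessor_candidates (const std::string& member)
{
  return candidates ("get", member);
}

std::vector<std::string> modifier_candidates (const std::string& member)
{
  return candidates ("set", member);
}
```

For the name_ member above, the accessor list contains name and the modifier list contains setName, which are the two functions the person class provides.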
 

If the ODB compiler is unable to find a suitable accessor or modifier function, then we can specify one explicitly with the new get and set pragmas. For example:

 
#pragma db object
class person
{
public:
  const std::string& get_full_name () const;
  std::string& set_full_name ();
 
private:
  #pragma db get(get_full_name) set(set_full_name)
  std::string name_;
 
  ...
};
 

In fact, it doesn’t have to be just a function. Rather, it can be an accessor or modifier expression. Here is a more interesting example:

 
#pragma db object
class person
{
public:
  const char* name () const;
  void name (const char*);
 
private:
  #pragma db get(std::string (this.name ())) \
             set(this.name ((?).c_str ()))
  std::string name_;
 
  ...
};
 

For more information on automatic discovery of accessors and modifiers, refer to Section 3.2, “Declaring Persistent Objects and Values” in the ODB manual. For details on how to specify custom accessor/modifier expressions, see Section 12.4.5, “get/set/access” as well as the access example in the odb-examples package.
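
If the expression syntax looks cryptic, it may help to see what the two expressions from the last example amount to in ordinary C++. The sketch below is my reading of them, not code that ODB generates: this stands for the object being accessed and (?) for the value being assigned. The inline person implementation and the get_expr/set_expr helper names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Stand-in for the person class from the example: it only exposes a
// C-string interface for the name.
class person
{
public:
  const char* name () const {return name_.c_str ();}

  void name (const char* n)
  {
    assert (n != nullptr);
    name_ = n;
  }

private:
  std::string name_;
};

// Equivalent of get(std::string (this.name ())): call the accessor and
// convert the result to the type of the data member.
std::string get_expr (const person& p)
{
  return std::string (p.name ());
}

// Equivalent of set(this.name ((?).c_str ())): v plays the role of the
// (?) placeholder, that is, the value being stored in the member.
void set_expr (person& p, const std::string& v)
{
  p.name (v.c_str ());
}
```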

Virtual Data Members

A virtual data member is an imaginary data member that is only used for the purpose of database persistence. A virtual data member does not actually exist (that is, occupy space) in the C++ class.

At first, the idea of a virtual data member may seem odd but if you think about it, it’s a natural extension of the accessor/modifier support discussed above. After all, if we have an accessor/modifier pair, why do we have to have a physical data member to tie it to? Probably the best way to illustrate this idea is to show how we can use virtual data members to handle the C++ pimpl idiom:

 
#pragma db object
class person
{
public:
  const std::string& name () const;
  void name (const std::string&);
 
  unsigned short age () const;
  void age (unsigned short);
 
private:
  struct impl;
 
  #pragma db transient
  impl* pimpl_;
 
  #pragma db member(name) virtual(std::string)
  #pragma db member(age) virtual(unsigned short)
 
  ...
};
 

Besides the pimpl idiom, virtual data members can also be useful to aggregate or dis-aggregate real data members and to handle third-party types for which names of real data members may not be known.

Note also that virtual data members have nothing to do with C++ virtual functions or virtual inheritance. Specifically, no virtual function call overhead is incurred when using virtual data members.

For more information on virtual data members, refer to Section 12.4.13, “virtual” in the ODB manual as well as the access and pimpl examples in the odb-examples package.
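
For completeness, here is a minimal sketch of what the implementation side of such a pimpl class could look like. The declarations and pragmas follow the example above; the impl layout, the constructors, and the use of std::unique_ptr instead of a raw pointer are my assumptions. A regular C++ compiler simply ignores the db pragmas (possibly with a warning), so the fragment can be compiled on its own.

```cpp
#include <memory>
#include <string>

#pragma db object
class person
{
public:
  person ();
  person (const std::string& name, unsigned short age);
  ~person ();

  const std::string& name () const;
  void name (const std::string&);

  unsigned short age () const;
  void age (unsigned short);

private:
  struct impl;

  #pragma db transient
  std::unique_ptr<impl> pimpl_; // Assumption: smart pointer, not impl*.

  #pragma db member(name) virtual(std::string)
  #pragma db member(age) virtual(unsigned short)
};

// The rest would normally live in the source file, hidden from clients.
struct person::impl
{
  std::string name;
  unsigned short age;
};

person::person (): pimpl_ (new impl ()) {}

person::person (const std::string& n, unsigned short a)
  : pimpl_ (new impl {n, a}) {}

person::~person () = default; // impl is a complete type at this point.

const std::string& person::name () const {return pimpl_->name;}
void person::name (const std::string& n) {pimpl_->name = n;}

unsigned short person::age () const {return pimpl_->age;}
void person::age (unsigned short a) {pimpl_->age = a;}
```

The only state visible in the class definition is the transient pointer; the name and age values that end up in the database are reached purely through the accessor/modifier pairs.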

Database Indexes

ODB now supports defining database indexes within the pragma language. If all you need is a simple index on a particular data member (simple or composite), then all you have to do is specify either the index (for non-unique index) or unique (for unique index) pragma. For example:

 
#pragma db object
class person
{
  ...
 
  #pragma db unique
  std::string name_;
 
  #pragma db index
  unsigned short age_;
};
 

It is also possible to define an index on more than one member as well as to give it a custom name:

 
#pragma db object
class person
{
  ...
 
  std::string first_;
  std::string last_;
 
  #pragma db index("name_i") unique members(first_, last_)
};
 

ODB also supports database-specific index types, methods, and options. Here is an example of a more involved PostgreSQL-specific index definition:

 
#pragma db object
class person
{
  ...
 
  std::string name_;
 
  #pragma db index                            \
             type("UNIQUE CONCURRENTLY")      \
             method("HASH")                   \
             member(name_, "DESC")            \
             options("WITH(FILLFACTOR = 80)")
};
 

For more information on defining database indexes, refer to Section 12.6, “Index Definition Pragmas” in the ODB manual.

Mapping Extended Database Types

Besides the standard integers, strings, and BLOBs, most modern database implementations also provide a slew of extended SQL types: geospatial types, user-defined types, collections (arrays, table types, etc.), key-value stores, XML, JSON, and so on. While ODB does not support such extended types directly (it would take years to cover all the types in all the databases), it now includes a mechanism which, with a bit of effort, allows you to map pretty much any extended SQL type to any C++ type.

This is a really big and powerful feature. As a result, I wrote a separate post that is dedicated just to Extended Database to C++ Type Mapping. It provides much more detail and some cool examples. There is also Section 12.7, “Database Type Mapping Pragmas” in the ODB manual.

Profile Library Improvements

Both the Boost and Qt profile libraries now include persistence support for their respective UUID types. By default, these types are mapped to a UUID SQL type if the database provides such a type (e.g., UUID in PostgreSQL and UNIQUEIDENTIFIER in SQL Server) or to a suitable 16-byte binary type otherwise. As a result, you can now use boost::uuids::uuid and QUuid in your persistent classes without any extra effort:

 
// Boost version.
//
#pragma db object
class person
{
  ...
 
  boost::uuids::uuid id_;
};

// Qt version.
//
#pragma db object
class Person
{
  ...
 
  QUuid id_;
};
 

For more information on UUID support in Boost, refer to Section 19.6, “Uuid Library” in the ODB manual. For Qt, see Section 20.1, “Basic Types”.

In addition, the Boost profile now includes support for the Multi-Index container library. While there are some interesting implementation details about which I am planning to write in a separate post, from the user perspective, multi_index_container can now be used in persistent classes just as any standard container. For example:

 
namespace mi = boost::multi_index;
 
#pragma db object
class person
{
  ...
 
  typedef
  mi::multi_index_container<
    std::string,
    mi::indexed_by<
      mi::sequenced<>,
      mi::ordered_unique<mi::identity<std::string> >
    >
  > emails;
 
  emails emails_;
};
 

For more information on Multi-Index container support, refer to Section 19.3, “Multi-Index Container Library” in the ODB manual.

Combined Database Schema

The ODB compiler now supports the generation of a combined SQL file from multiple header files. For example:

 
odb ... --generate-schema-only --at-once --output-name schema \
employee.hxx employer.hxx
 

The result of the above command will be the schema.sql file that contains the database creation code (DDL statements) for persistent classes defined in both the employee.hxx and employer.hxx headers.

A combined SQL file can be easier to work with, for example, when you need to send the schema to a DBA for review. It can also be useful when dealing with circular dependencies, as discussed in Section 6.3, “Circular Relationships” in the ODB manual.

C++11 std::array to BLOB Mapping

ODB now includes built-in support for mapping C++11 std::array<char, N> and std::array<unsigned char, N> types to BLOB/BINARY database types. For example:

 
#pragma db object
class person
{
  ...
 
  #pragma db type("BINARY(1024)")
  std::array<char, 1024> pub_key_;
};
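
Since a std::array always holds exactly N elements, data that is shorter than the buffer has to be padded by the application before it is assigned to such a member. The following helper is one way to do this. It is plain C++ and not part of ODB; the to_fixed name and the choice of zero padding are my assumptions.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>

// Copy variable-length data into a zero-padded, fixed-size buffer.
// Throws std::length_error if the data does not fit.
template <std::size_t N>
std::array<char, N> to_fixed (const std::string& bytes)
{
  if (bytes.size () > N)
    throw std::length_error ("data does not fit into the buffer");

  std::array<char, N> r {}; // All elements are zero-initialized.
  std::copy (bytes.begin (), bytes.end (), r.begin ());
  return r;
}
```

With this in place, a member like pub_key_ could be initialized with to_fixed<1024> (key_data), where key_data is whatever holds the raw key bytes.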
 

SQLite Support Improvements

On Windows, the SQLite ODB runtime now supports persistence of std::wstring. You can also pass the database name as std::wstring in addition to std::string. The odb::sqlite::database class constructors have also been extended to accept the virtual filesystem (vfs) module name. Finally, the default SQLite mapping for float and double now allows the NULL value since SQLite treats NaN values as NULL.

ODB License Exception

Wednesday, August 22nd, 2012

ODB is dual-licensed under the GPL and a proprietary license (there is both a free and a commercial version of the latter). As discussed in the ODB Licensing FAQ, this model caters fairly well to other open source projects that use the GPL (or a similar license), private closed source projects, as well as commercial closed source projects. However, there is a significant drawback in this dual-licensing model when it comes to other open source projects that use a more liberal (than the GPL) license. In this post I would like to discuss the nature of this problem in more detail as well as explain what we did to allow more liberally-licensed projects to still use and benefit from ODB.

Before we go into the details of the ODB situation, let’s first get a quick overview of the state of open source software licensing. All open source software licenses can be broadly divided into two categories, the so-called copyleft and non-copyleft licenses. Copyleft licenses give us the right to distribute copies and derivative works of software but require that the same rights be preserved in the derivative works. In other words, it is not possible to take a copyleft-licensed project, say a library, use it in our own project (thus forming a derivative work) and then release our project as closed source. In contrast, non-copyleft licenses allow us to base proprietary, closed source works on the non-copyleft software. The most popular copyleft license is the GNU General Public License (GPL). For non-copyleft, several licenses are widely used, including BSD, MIT, and the Apache License.

A quick note on the term derivative used in the preceding paragraph. Many people mistakenly assume it to mean only modifications to the original software. In reality, derivative work is a much broader concept. In our context, it is more useful to think of it as covering any functional dependency between two pieces of software. For example, if we use a GUI library in our application to display some information, then our application is a derivative work of that GUI library.

It is also useful to understand the philosophical grounds of the copyleft vs non-copyleft debate. Proponents of the copyleft licenses want to preserve the freedom of every end-user to be able to distribute and modify the software. And if derivative works can be released closed source, then such a freedom would be denied. In contrast, the non-copyleft camp believes that the choice to release derivative works open or closed source is as essential a freedom as any other. There is also a third group, let’s call them the quid pro quo camp, which prefers the copyleft licenses because they ensure that those who benefit from their work are also required to give something back (i.e., make their work open source). Dual-licensing software under a copyleft license as well as a proprietary license on a commercial basis is then a natural extension of the quid pro quo idea. This way, companies that are unable or unwilling to make their derivative works open source have the option to instead pay a fee for a proprietary license. These fees can then be used to fund further development of the original project.

A copyleft project, say GPL-licensed, can normally use non-copyleft software, say BSD-licensed, without imposing any further restrictions on its users. This is not the case, however, in the other direction, for example, a BSD-licensed project using GPL-licensed software. While releasing the derivative work under BSD will satisfy the requirements of the GPL, the resulting whole now carries the restrictions of both BSD and the GPL. And that means that the users of the project no longer have the freedom to use it in proprietary derivative works.

If we are proponents of the copyleft licenses for the sake of preserving the end-user freedoms, then this outcome is exactly what we would want. However, if we use a copyleft license as a way to implement the quid pro quo principle, then this is not quite what we had in mind. After all, the author of the project is giving something back by releasing his software under an open source license. Further restricting what others can do with this software is not something that we should probably attempt.

And that’s exactly the problem that we faced with ODB. We are happy to let everyone use ODB in their projects as long as they make them open source under any open source license, copyleft or non-copyleft. However, as we have just discussed, the standard terms of the GPL make ODB really unattractive to non-copyleft projects.

So what we decided to do is to offer to grant a GPL license exception to any specific open source project that uses any of the common non-copyleft licenses (BSD, MIT, Apache License, etc). This exception would allow the project to use ODB without any of the GPL copyleft restrictions. Specifically, the users of such a project would still be able to use it in their closed source works even though the result would depend on ODB.

You may be wondering why we didn’t just grant a generic license exception that covers all open source projects. Why do we need to issue exceptions on a project-by-project basis? The reason is that a simple generic exception would be easy to abuse. For example, a company wishing to use ODB in a closed source application could just package and release the generated database support code without any additional functionality as an open source project. It could then go ahead and use that code in a closed source application without any of the GPL restrictions. While it is possible to prevent such abuse using clever legal language, we found that a complex license exception text will only confuse things. Instead we decided to go with a very straightforward license exception text and to offer it to any open source project that looks legitimate.

In fact, the other day we granted our first exception. It was for the POLARIS Transportation Modelling Framework developed by Argonne National Laboratory (U.S. Department of Energy) and released under the BSD license.

So what do you need to do if you want an ODB license exception for your project? The process is actually very easy. Simply email us at info@codesynthesis.com the following information about your project:

  1. Project name
  2. Project URL
  3. License used (e.g., BSD, MIT, etc)
  4. Copyright holder(s)

Once we receive this, we will send you the license exception text that will be specific to your project. You can then add this text as a separate file to your source code distribution along with the LICENSE, COPYING, or a similar file. Or you can incorporate the exception text directly into one of these files. We also store each exception granted in our publicly-accessible ODB source code repository.

C++ Event Logging with SQLite and ODB

Tuesday, August 7th, 2012

At some point most applications need some form of logging or tracing, be it error logging, general execution tracing, or some application-specific event logging. The first two are quite common and a lot of libraries have been designed to support them. The last one, however, normally requires a custom implementation for each application since the data that we wish to log, the media on which the logs are stored, and the operations that we need to perform on these logs vary greatly from case to case. In this post I am going to show how to implement event logging in C++ applications using the SQLite relational database and the ODB C++ ORM.

When I said that we will be storing our events in a relational database, some of you might have thought I was crazy. But SQLite is no ordinary relational database. First of all, SQLite is not your typical client-server RDBMS. Rather, it is an extremely small and fast embedded database. That is, it is linked to your application as a library and the actual database is just a file on a disk. And, as a bonus, SQLite also supports in-memory databases.

Ok, maybe you are still not convinced that SQLite is a good fit for event logging. Consider then the benefits: With a relational database we can easily store statically-typed data in our event records. With SQLite we can store the log as a file on a disk, we can have an in-memory log, or some combination of the two (SQLite allows us to save an in-memory database to a file). And don’t forget that SQLite is fully ACID-compliant. This can be especially important for being able to audit the logs with any kind of certainty that they are accurate. Finally, when it comes to operations that we can perform on our logs, it is hard to beat the generality offered by an RDBMS for data querying and manipulation.

Maybe you are almost convinced, but there is just one last problem: who wants to deal with all that SQL from within a C++ application? Well, that’s where ODB comes to the rescue. As we will see shortly, we can take advantage of all these benefits offered by SQLite without having a single line of SQL in our code. Note also that while the below discussion is focused on SQLite, nothing prevents you from using another RDBMS instead. For example, if you needed to store your logs in a central network location, then you could instead use a client-server database such as PostgreSQL.

But enough talking. Let’s see the code. A good example of an application that usually requires logging is a game server. The log could contain, for example, information about when players entered and left the game. However, for this post I’ve decided to use something less typical: an automated stock trading system. Here we would like to keep an accurate log of all the buy/sell transactions performed by the system. This way we can later figure out how come we lost so much money.

Our event record, as a C++ class, could then look like this (I am omitting accessors and instead making all the data members public for brevity):

 
enum order_type {buy, sell};
 
class event
{
public:
  event (order_type type,
         const std::string& security,
         unsigned int qty,
         double price);
 
  order_type order;
  std::string security;
  unsigned int qty;
  double price;
  boost::posix_time::ptime timestamp;
};
 
std::ostream& operator<< (std::ostream&, const event&);
 

The event class constructor initializes the first four data members with the passed values and sets timestamp to the current time. Here I chose to use the Boost date-time library to get a convenient time representation. Other alternatives would be to use QDateTime from Qt or just have an unsigned long long value that stores time in microseconds.

Ok, we have the event class but how do we store it in an SQLite database? This is actually pretty easy with ODB. The only two modifications that we have to make to our class are to mark it as persistent and add the default constructor, which we can make private:

 
#pragma db object no_id
class event
{
  ...
 
private:
  friend class odb::access;
  event () {}
};
 

The friend declaration grants ODB access to the private default constructor, and to the data members, in case we decide to make them private as well.

You may also be wondering what that no_id clause at the end of #pragma db means. It tells the ODB compiler that this persistent class has no object id. Normally, persistent classes will have one or more data members designated as an object id. This id uniquely identifies each instance of a class in the database. Under the hood, object ids are mapped to primary keys. However, there are cases where a persistent class has no natural object id and creating an artificial one, for example, an auto-incrementing integer, doesn’t add anything other than performance and storage overheads. Many ORM implementations do not support objects without ids. ODB, on the other hand, is flexible enough to support this use-case.

Our persistent event class is ready; see the event.hxx file for the complete version if you would like to try the example yourself. The next step is to compile this header with the ODB compiler to generate C++ code that actually does the saving/loading of this class to/from the SQLite database. Here is the command line:

odb -d sqlite -s -q -p boost/date-time event.hxx

Let’s go through the options one-by-one: The -d (--database) option specifies the database we are using (SQLite in our case). The -s (--generate-schema) option instructs the ODB compiler to also generate the database schema, that is the SQL DDL statements that create tables, etc. By default, for SQLite, the schema is embedded into the generated C++ code. The -q (--generate-query) option tells the ODB compiler to generate query support code. Later we will see how to use queries to analyze our logs. However, if your application doesn’t perform any analysis or if this analysis is performed off-line by another application, then you can omit this option and save on some generated code. Finally, the -p (--profile) option tells the ODB compiler to include the boost/date-time profile. We need it because we are using boost::posix_time::ptime as a data member in our persistent class (see ODB Manual for more information on ODB profiles).

The above command will produce three new files: event-odb.hxx, event-odb.ixx, and event-odb.cxx. The first file we need to #include into our source code and the third file we need to compile and link to our application.

Now we are ready to log some events. Here is an example that opens an SQLite database, re-creates the schema, and then logs a couple of events. I’ve added a few calls to sleep() so that we can get some time gaps between records:

 
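#include <unistd.h> // sleep()

#include <odb/database.hxx>
#include <odb/transaction.hxx>
#include <odb/schema-catalog.hxx>
#include <odb/sqlite/database.hxx>

#include "event.hxx"
#include "event-odb.hxx"

using namespace odb::core; // transaction, schema_catalog, etc.

void log (odb::database& db, const event& e)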
{
  transaction t (db.begin ());
  db.persist (e);
  t.commit ();
}
 
int main ()
{
  odb::sqlite::database db (
    "test.db",
    SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE);
 
  // Create the database schema.
  //
  {
    transaction t (db.begin ());
    schema_catalog::create_schema (db);
    t.commit ();
  }
 
  // Log some trading events.
  //
  log (db, event (buy,  "INTC", 100, 25.10));
  log (db, event (buy,  "AMD",  200, 4.05));
  log (db, event (buy,  "ARM",  200, 26.25));
  sleep (1);
  log (db, event (sell, "AMD",  50,  5.45));
  log (db, event (buy,  "ARM",  300, 25.35));
  log (db, event (sell, "INTC", 100, 24.45));
  sleep (1);
  log (db, event (sell, "AMD",  100, 4.45));
  log (db, event (sell, "ARM",  150, 27.75));
}
 

We are now ready to build and run our example (see driver.cxx for the complete version). Here is the command line I used on my GNU/Linux box:

 
g++ -o driver driver.cxx event-odb.cxx -lodb-boost \
  -lodb-sqlite -lodb -lboost_date_time
 

If we now run this example, it takes a few seconds (remember the sleep() calls) and finishes without any indication of whether it was successful or not. In a moment we will add some querying code to our example which will print returned log records. In the meantime, however, we can fire up the sqlite3 utility, which is a command line SQL client for SQLite, and examine our database:

 
$ sqlite3 test.db
sqlite> select * from event;
 
0|INTC|100|25.1|2012-07-29 15:56:39.317903
0|AMD|200|4.05|2012-07-29 15:56:39.318812
0|ARM|200|26.25|2012-07-29 15:56:39.319459
1|AMD|50|5.45|2012-07-29 15:56:40.320095
0|ARM|300|25.35|2012-07-29 15:56:40.321089
1|INTC|100|24.45|2012-07-29 15:56:40.321703
1|AMD|100|4.45|2012-07-29 15:56:41.322475
1|ARM|150|27.75|2012-07-29 15:56:41.323717
 

As you can see, the database seems to contain all the records that we have logged.

Ok, logging was quite easy. Let’s now see how we can run various queries on the logged records. To start, a simple one: find all the transactions involving AMD:

 
typedef odb::query<event> query;
typedef odb::result<event> result;
 
transaction t (db.begin ());
 
result r (db.query<event> (query::security == "AMD"));
 
for (result::iterator i (r.begin ()); i != r.end (); ++i)
  cerr << *i << endl;
 
t.commit ();
 

Or, a much more elegant C++11 version of the same transaction:

 
transaction t (db.begin ());
 
for (auto& e: db.query<event> (query::security == "AMD"))
  cerr << e << endl;
 
t.commit ();
 

If we add this fragment at the end of our driver and run it, we will see the following output:

 
2012-Jul-29 16:03:05.585545 buy 200 AMD @ 4.05
2012-Jul-29 16:03:06.587784 sell 50 AMD @ 5.45
2012-Jul-29 16:03:07.590787 sell 100 AMD @ 4.45
 

What if we were only interested in transactions that happened in, say, the last 2 seconds? That’s also easy:

 
ptime tm (microsec_clock::local_time () - seconds (2));
 
result r (db.query<event> (query::security == "AMD" &&
                           query::timestamp >= tm));
 
for (result::iterator i (r.begin ()); i != r.end (); ++i)
  cerr << *i << endl;
 

Again, if we add this code to the driver and run it, the output will change to:

 
2012-Jul-29 16:03:06.587784 sell 50 AMD @ 5.45
2012-Jul-29 16:03:07.590787 sell 100 AMD @ 4.45
 

I think you get the picture: with ODB we can slice and dice the log in any way we please. We can even select a subset of the record members or run aggregate queries thanks to an ODB feature called views.

Another operation that we may want to perform from time to time is to trim the log. The following code will delete all the records that are older than 2 seconds:

 
ptime tm (microsec_clock::local_time () - seconds (2));
db.erase_query<event> (query::timestamp < tm);
 

What about multi-threaded applications? Can we log to the same database from multiple threads simultaneously? The answer is yes. While it takes some effort to make an SQLite database usable from multiple threads, ODB takes care of that for us. With the release of SQLite 3.7.13, we can now even share an in-memory database among multiple threads (see this note for details).

As you can see, put together, SQLite and ODB make a pretty powerful yet very simple to set up and use event logging system. It is fast, small, multi-threaded, and ACID-compliant. For more information on ODB, refer to the ODB project page or jump directly to the ODB Manual.