ODB 2.3.0 Released

Wednesday, October 30th, 2013

ODB 2.3.0 was released today. It took eight months of development, and you may be wondering why so long. The answer is database schema evolution.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, and without manually writing any of the mapping code. ODB natively supports SQLite, PostgreSQL, MySQL, Oracle, and Microsoft SQL Server. Pre-built packages are available for GNU/Linux, Windows, Mac OS X, and Solaris. Supported C++ compilers include GCC, MS Visual C++, Sun CC, and Clang.

While there are a couple of other major new features in this release (discussed below), the most important one is undoubtedly support for database schema evolution. I am not going to talk much about the technical side of it in this post. The feature is really big and requires quite a bit of background before one can see how everything fits together. So, instead, if you just want a taste of what’s available, I will send you to the Changing Persistent Classes section in the “Hello World” chapter. Otherwise, there is the Database Schema Evolution chapter that covers everything in detail, including the rationale for why things are done a certain way. I feel this is the best piece of documentation I have written thus far.

But, instead of discussing the technicals, let me tell you the story of database schema evolution in ODB. While this feature is now finished and ready and the whole thing may even seem fairly effortless (hey, 8 months is not that bad), it could have been the other way around. That is, it could have just as easily “finished me”, so to speak, since I was on the brink of giving up on a couple of occasions. Good software design is hard.

I started thinking about database schema evolution support in ODB after presenting at BoostCon 2011. I don’t think I was even 5 minutes into my talk when I got “the question”: but what about database schema evolution? At that time all I could say was that I was thinking about it.

And so I started thinking about schema evolution in the background. You know, when you drive somewhere, or when you are in the shower, the subconscious is trying to make sense of things.

After about six months of this background thinking I felt I was ready, so I decided to take a “full tilt” stab at schema evolution. The way I do this is to sit for days with a pen (Pilot G-Tec-C4 0.4mm) and paper (calculation pad, 5mm squares) and write down ideas, problems, possible solutions, etc., until things start crystallizing. This time, however, they just weren’t crystallizing. Instead, I was going deeper and deeper into various rabbit holes. So I decided to let it sit in the background some more.

My next attempt at database schema evolution started a couple of months before the C++Now 2013 conference, where I wanted to have a good answer to “the question”. This time the design process went better. I simplified quite a few things, and bouncing ideas off a couple of friends and trying to explain the design to them helped a lot. So for C++Now I had the schema migration support fully implemented and was hoping to finish everything off and make it public shortly after the conference. After my presentation, a number of people came up to me and said that they really liked the overall design, so I felt encouraged.

The only thing I needed to finish was support for data migration. The tricky part here is that when migrating data from one database version to the next, we need to be able to load old data while storing new data. In other words, our C++ object model must be able to work with both versions of the database at the same time. Sounds tricky, but I had a solution: object sections (discussed below). The idea is to partition data members corresponding to different schema versions into sections that are only loaded or updated if we are working with specific database versions. This sounded nice and clean since I was reusing a generally useful mechanism (object sections) to implement schema evolution. So I went ahead and implemented support for object sections.

Once I was done with object sections, I went back to schema evolution and quickly realized that things were not as nice and clean as I had originally thought. The problem is, sections only deal with loads and updates but not persists or erases. In other words, a data member belonging to a section will only be loaded or updated if the section is loaded or updated but will always be persisted and erased. And that meant I wouldn’t be able to persist new objects or erase old ones when working with older databases. So, it turned out, sections were a no-go and I needed a more general mechanism for data member versioning. This was the point where I seriously considered just not supporting database schema evolution at all.

In the end I managed to design and implement a much more orthogonal mechanism for data member versioning so that we can even use it inside sections (something that wouldn’t have been possible in the original design). But it took another couple of months of hard work. And then it took another couple of weeks to document the whole thing (almost 40 pages). So, good software design is hard. Striving for great design is agony.

Ok, that was a slightly different perspective on one of the features in ODB. I hope you enjoyed it. Let’s now see what other interesting new features are in 2.3.0. This time I promise I will stick to the technical side.

C++11 Enum Class

ODB now automatically maps C++11 enum classes in addition to C++98 enums. For example:

enum class color: unsigned char {red, green, blue};
 
#pragma db object
class object
{
  ...
 
  color color_; // Mapped to SMALLINT in PostgreSQL.
};

For more information on the default mapping of C++11 enums, refer to the ODB manual “Type Mapping” sections for each database.
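
As a rough illustration of why a small integer column is enough here, the following standalone sketch (plain C++11, no ODB involved) checks the underlying type and numeric values of the color enum from above. The enum_value() helper is a hypothetical name introduced only for this example, and the idea that the column holds the enumerator’s numeric value is inferred from the SMALLINT comment above, not taken from the manual:

```cpp
#include <type_traits>

enum class color: unsigned char {red, green, blue};

// The underlying type is the one named in the enum declaration.
static_assert (
  std::is_same<std::underlying_type<color>::type, unsigned char>::value,
  "color is stored as unsigned char");

// Hypothetical helper (not part of ODB): return the numeric value of an
// enumerator, which is presumably what an integer column would hold.
template <typename E>
constexpr typename std::underlying_type<E>::type
enum_value (E e)
{
  return static_cast<typename std::underlying_type<E>::type> (e);
}
```

Every value of an unsigned char enum fits into a 16-bit integer, so nothing is lost in such a mapping.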

Defining Composite Values Inside Persistent Classes

ODB now supports defining composite value types inside persistent classes, views, and other composite values. For example:

#pragma db object
class person
{
  ...
 
  #pragma db value
  struct name
  {
    std::string first;
    std::string last;
  };
 
  #pragma db id
  name name_;
};

Object Sections

Object sections are an optimization mechanism that allows us to partition data members of a persistent class into groups that can be independently loaded and/or updated. This can be useful, for example, if an object contains data members that are expensive to load or update (such as BLOBs or containers) and that are accessed or modified infrequently. For example:

#pragma db object
class person
{
  ...
 
  #pragma db load(lazy) update(manual)
  odb::section keys_;
 
  #pragma db section(keys_) type("BLOB")
  char public_key_[1024];
 
  #pragma db section(keys_) type("BLOB")
  char private_key_[1024];
};

The keys_ section above is lazily loaded and manually updated. Here is how we could use it:

transaction t (db.begin ());
 
auto_ptr<person> p (db.load<person> (...)); // Keys not loaded.
 
if (need_keys)
{
  db.load (*p, p->keys_); // Load keys.
  // Use keys ...
}
 
db.update (*p); // Keys not updated.
 
if (update_keys)
{
  // Change keys ...
  db.update (*p, p->keys_); // Update keys.
}
 
t.commit ();

A section can combine either load mode (eager or lazy) with any update mode (always, change, or manual). An eager-loaded section is always loaded as part of the object load. A lazy-loaded section is not loaded as part of the object load and has to be explicitly loaded with the database::load() function (as shown above) if and when necessary.

An always-updated section is always updated as part of the object update, provided it has been loaded. A change-updated section is only updated as part of the object update if it has been loaded and marked as changed. A manually-updated section is never updated as part of the object update and has to be explicitly updated with the database::update() function (as shown above) if and when necessary.
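
The rules in the last two paragraphs can be summed up in a small standalone model. This is only a sketch of the semantics as described here, not ODB code: the load_mode, update_mode, and section_state types and both functions are hypothetical names made up for this illustration.

```cpp
// Toy model of the section rules described above. None of these names
// exist in ODB; they only restate the prose as code.
enum class load_mode {eager, lazy};
enum class update_mode {always, change, manual};

struct section_state
{
  bool loaded;  // Has the section been loaded?
  bool changed; // Has the section been marked as changed?
};

// Is the section loaded as part of the object load?
inline bool
loaded_with_object (load_mode m)
{
  return m == load_mode::eager;
}

// Is the section updated as part of a plain object update, that is,
// without naming the section explicitly?
inline bool
updated_with_object (update_mode m, const section_state& s)
{
  switch (m)
  {
  case update_mode::always: return s.loaded;
  case update_mode::change: return s.loaded && s.changed;
  case update_mode::manual: return false; // Needs an explicit update.
  }

  return false;
}
```

In this model the keys_ section from the first example (lazy, manual) yields false from both functions, which matches the “Keys not loaded” and “Keys not updated” comments in that code.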

Here is an example of an eager-loaded, change-updated section:

#pragma db object
class person
{
  ...
 
  typedef std::array<char, 1024> key_type;
 
  void
  public_key (const key_type& k)
  {
    public_key_ = k;
    keys_.change ();
  }
 
  void
  private_key (const key_type& k)
  {
    private_key_ = k;
    keys_.change ();
  }
 
  #pragma db update(change)
  odb::section keys_;
 
  #pragma db section(keys_) type("BLOB")
  key_type public_key_;
 
  #pragma db section(keys_) type("BLOB")
  key_type private_key_;
};

And here is how we can use this section:

transaction t (db.begin ());
 
auto_ptr<person> p (db.load<person> (...)); // Keys loaded.
 
db.update (*p); // Keys not updated since not changed.
 
p->public_key (new_public_key);
 
db.update (*p); // Keys updated since changed.
 
t.commit ();

For more information on this feature, refer to Chapter 9, “Sections” in the ODB manual as well as the section example in the odb-examples package.

There are also other interesting new features in this release. For the complete list of changes, see the official ODB 2.3.0 announcement.