Archive for the ‘ORM’ Category

ODB packages for RHEL and Fedora

Wednesday, February 26th, 2014

Just a quick note to let you know that ODB 2.3.0 packages for RHEL 5 and 6 as well as Fedora 19-21 are now available from the official repositories (EPEL and Fedora). See the mailing list announcement for all the technical details.

While this is great news and a major achievement for ODB (especially considering that there are 7 packages comprising the ODB system), what saddens me somewhat, as a Debian user myself, is that there are no such packages for Debian/Ubuntu yet. In fact, there is not even an effort under way. All we have is a “wishlist” bug report for ODB.

At the same time I believe it should now be very straightforward to package ODB for Debian. The system was designed from the ground up to be packager-friendly (that’s why it is split into multiple packages instead of being one monolithic block). Also, the packaging efforts, first for Gentoo and then for EL/Fedora, ironed out a lot of kinks. So if anyone is interested in packaging ODB for Debian, let us know (post to the odb-users mailing list). I will personally assist you in any way I can, as I did in the past for the Gentoo and EL/Fedora packaging efforts.

ODB 2.3.0 Released

Wednesday, October 30th, 2013

ODB 2.3.0 was released today. It has been 8 months in development, and you may be wondering what took us so long. The answer is database schema evolution.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, and without manually writing any of the mapping code. ODB natively supports SQLite, PostgreSQL, MySQL, Oracle, and Microsoft SQL Server. Pre-built packages are available for GNU/Linux, Windows, Mac OS X, and Solaris. Supported C++ compilers include GCC, MS Visual C++, Sun CC, and Clang.

While there are a couple of other major new features in this release (discussed below), the most important one is undoubtedly support for database schema evolution. I am not going to talk much about the technical side of it in this post. The feature is really big and requires quite a bit of background concepts before one can realize how everything fits together. So, instead, if you are looking to just get a taste of what’s available, I will send you to the Changing Persistent Classes section in the “Hello World” chapter. Otherwise, there is the Database Schema Evolution chapter that covers everything in detail, including the rationale for why things are done a certain way. I feel this is the best piece of documentation I have written thus far.

But, instead of discussing the technicals, let me tell you the story of database schema evolution in ODB. While this feature is now finished and ready and the whole thing may even seem fairly effortless (hey, 8 months is not that bad), it could have been the other way around. That is, it could have just as easily “finished me”, so to speak, since I was on the brink of giving up on a couple of occasions. Good software design is hard.

I started thinking about database schema evolution support in ODB after presenting at Boostcon 2011. There, I don’t think I was even 5 minutes into my talk when I got “the question”: but what about database schema evolution? At that time all I could say was that I was thinking about it.

And so I started thinking about schema evolution in the background. You know, when you drive somewhere, or when you are in the shower, the subconscious is trying to make sense of things.

After about six months of this background thinking I thought I was ready, so I decided to take a “full tilt” stab at schema evolution. The way I do this is to sit for days with a pen (Pilot G-Tec-C4 0.4mm) and paper (calculation pad, 5mm squares) and write down ideas, problems, possible solutions, etc., until things start crystallizing. This time, however, they just weren’t crystallizing. Instead I was going deeper and deeper into various rabbit holes. So I decided to let it sit in the background some more.

My next attempt at database schema evolution started a couple of months before the C++Now 2013 conference, where I wanted to have a good answer to “the question”. This time the design process went better. I simplified quite a few things, and bouncing ideas off a couple of friends and trying to explain the design to them also helped a lot. So for C++Now I had the schema migration support fully implemented and was hoping to finish everything off and make it public shortly after the conference. After my presentation, a number of people came up to me and said that they really liked the overall design, so I felt encouraged.

The only thing I needed to finish was support for data migration. The tricky part here is that when migrating data from one database version to the next, we need to be able to load old data while storing new data. In other words, our C++ object model must be able to work with both versions of the database at the same time. Sounds tricky, but I had a solution: object sections (discussed below). The idea is to partition data members corresponding to different schema versions into sections that are only loaded or updated if we are working with specific database versions. This sounded nice and clean since I was reusing a generally useful mechanism (object sections) to implement schema evolution. So I went ahead and implemented support for object sections.

Once I was done with object sections, I went back to schema evolution and quickly realized that things were not as nice and clean as I originally thought. The problem is, sections only deal with loads and updates but not persists or erases. In other words, a data member belonging to a section will only be loaded or updated if the section is loaded or updated but will always be persisted and erased. And that meant I wouldn’t be able to persist new objects or erase old ones when working with older databases. So, it turned out, sections were a no-go and I needed a more general mechanism for data member versioning. This was the point where I seriously considered just not supporting database schema evolution at all.

In the end I managed to design and implement a much more orthogonal mechanism for data member versioning so that we can even use it inside sections (something that wouldn’t have been possible in the original design). But it took another couple of months of hard work. And then it took another couple of weeks to document the whole thing (almost 40 pages). So, good software design is hard. Striving for great design is agony.
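To give a flavor of what the resulting data member versioning looks like, here is a sketch based on the conventions described in the Database Schema Evolution chapter. Treat it as an illustration only: the version numbers, class, and members are made up, and the exact pragma forms should be checked against the manual.

```cpp
#include <string>

// Declare the base and current model versions (sketch; see the
// Database Schema Evolution chapter for the exact semantics).
#pragma db model version(1, 2)

#pragma db object
class person
{
public:
  #pragma db id auto
  unsigned long id_ = 0;

  std::string name_;

  // Soft-added member: only present in the database starting with
  // schema version 2. Until the database is migrated, it is treated
  // as if it were not there.
  #pragma db added(2)
  std::string email_;
};
```

The key point is that the old and new object models coexist in one C++ class, which is what makes it possible to load old data while storing new data during migration.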

Ok, that was a slightly different perspective on one of the features in ODB. I hope you enjoyed it. Let’s now see what other interesting new features are in 2.3.0. This time I promise I will stick to the technical side.

C++11 Enum Class

ODB now automatically maps C++11 enum classes in addition to C++98 enums. For example:

enum class color: unsigned char {red, green, blue};
 
#pragma db object
class object
{
  ...
 
  color color_; // Mapped to SMALLINT in PostgreSQL.
};

For more information on the default mapping of C++11 enums, refer to the ODB manual “Type Mapping” sections for each database.

Defining Composite Values Inside Persistent Classes

ODB now supports defining composite value types inside persistent classes, views, and other composite values. For example:

#pragma db object
class person
{
  ...
 
  #pragma db value
  struct name
  {
    std::string first;
    std::string last;
  };
 
  #pragma db id
  name name_;
};

Object Sections

Object sections are an optimization mechanism that allows us to partition data members of a persistent class into groups that can be independently loaded and/or updated. This can be useful, for example, if an object contains data members that are expensive to load or update (such as BLOBs or containers) and that are accessed or modified infrequently. For example:

#pragma db object
class person
{
  ...
 
  #pragma db load(lazy) update(manual)
  odb::section keys_;
 
  #pragma db section(keys_) type("BLOB")
  char public_key_[1024];
 
  #pragma db section(keys_) type("BLOB")
  char private_key_[1024];
};

The keys_ section above is lazily loaded and manually updated. Here is how we could use it:

transaction t (db.begin ());
 
auto_ptr<person> p (db.load<person> (...)); // Keys not loaded.
 
if (need_keys)
{
  db.load (*p, p->keys_); // Load keys.
  // Use keys ...
}
 
db.update (*p); // Keys not updated.
 
if (update_keys)
{
  // Change keys ...
  db.update (*p, p->keys_); // Update keys.
}
 
t.commit ();

A section can be any combination of eager or lazy loaded and always, change, or manually updated. An eager-loaded section is always loaded as part of the object load. A lazy-loaded section is not loaded as part of the object load and has to be explicitly loaded with the database::load() function (as shown above) if and when necessary.

An always-updated section is always updated as part of the object update, provided it has been loaded. A change-updated section is only updated as part of the object update if it has been loaded and marked as changed. A manually-updated section is never updated as part of the object update and has to be explicitly updated with the database::update() function (as shown above) if and when necessary.
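The state machine described above can be modeled with a few lines of plain C++. The following is only an illustrative sketch (it is not the real odb::section class, and should_update() is an invented helper): a section remembers whether it has been loaded and whether it has been marked as changed, and a change-updated policy consults both flags.

```cpp
// Illustrative model of a section's state (not the real odb::section).
struct section
{
  bool loaded = false;  // Has the section been loaded?
  bool changed = false; // Has it been marked as changed?

  void change () {changed = true;} // Mark as changed.
};

// Hypothetical change-updated policy: update the section as part of
// the object update only if it was loaded and marked as changed.
inline bool
should_update (const section& s)
{
  return s.loaded && s.changed;
}
```

A lazy-loaded section would simply start with loaded set to false until an explicit load, while a manually-updated one would ignore these flags entirely and rely on explicit update calls.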

Here is an example of an eager-loaded, change-updated section:

#pragma db object
class person
{
  ...
 
  typedef std::array<char, 1024> key_type;
 
  void
  public_key (const key_type& k)
  {
    public_key_ = k;
    keys_.change ();
  }
 
  void
  private_key (const key_type& k)
  {
    private_key_ = k;
    keys_.change ();
  }
 
  #pragma db update(change)
  odb::section keys_;
 
  #pragma db section(keys_) type("BLOB")
  key_type public_key_;
 
  #pragma db section(keys_) type("BLOB")
  key_type private_key_;
};

And here is how we can use this section:

transaction t (db.begin ());
 
auto_ptr<person> p (db.load<person> (...)); // Keys loaded.
 
db.update (*p); // Keys not updated since not changed.
 
p->public_key (new_public_key);
 
db.update (*p); // Keys updated since changed.
 
t.commit ();

For more information on this feature, refer to Chapter 9, “Sections” in the ODB manual as well as the section example in the odb-examples package.

There are also other interesting new features in this release. For the complete list of changes, see the official ODB 2.3.0 announcement.

ODB 2.2.0 Released

Wednesday, February 13th, 2013

ODB 2.2.0 was released today. As you might have noticed, for the past several months I’ve been neglecting this blog. This release will hopefully explain my lack of attention.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, and without manually writing any of the mapping code. ODB natively supports SQLite, PostgreSQL, MySQL, Oracle, and Microsoft SQL Server. Pre-built packages are available for GNU/Linux, Windows, Mac OS X, and Solaris. Supported C++ compilers include GCC, MS Visual C++, Sun CC, and Clang.

This release packs quite a few major new features, including multi-database support, prepared queries, change-tracking containers, and automatic mapping for char[N]. As always, below I am going to examine these and other notable new features in more detail. For the complete list of changes, see the official ODB 2.2.0 announcement.

But before we go into detail on the new ODB features, let me mention a few other major improvements and additions. First of all, there is support for Qt5 (in addition to Qt4). This includes VC++ project/solution files, autotools build system updates, code fixes, etc. We have also thoroughly tested Qt5 support on GNU/Linux, Windows (VC++ 10.0), and MinGW-W64.

Generally, a lot of testing goes into an ODB release. Just think of all the combinations: five databases, more than ten compilers (including versions), two C++ modes (98 and 11), and multiple architectures (x86, x86_64, ARM, SPARC). Just building and running all the tests on Windows takes several days. The official announcement mentioned above has a cool table that lists all the combinations tested.

We have also upgraded the private copy of GCC that is used by the ODB compiler binary packages from 4.5.1 to 4.7.2 (actually, it is the 4.7.3 pre-release) on all the platforms. This should make a difference to projects wishing to use some of the C++11 features that weren’t yet implemented in GCC 4.5.

Another notable addition is the “official” support for mobile/embedded systems. We have always kept this class of applications in mind by making sure the footprint stays small and by providing fine-grained control over the optional features used. With this release we have added Raspberry Pi ARM GNU/Linux computer as one of the test targets. We’ve also written a detailed, step-by-step guide on using ODB on Mobile and Embedded Systems. At the end of this guide, there are some indicative footprint numbers. Check it out.

More good news, thanks to the debugging efforts of Hugo Mildenberger, ODB now works with MariaDB. While the MariaDB project claims that it is “a drop-in replacement for MySQL”, they have actually re-implemented the MySQL client library. As you can imagine, it is hard not to end up with some subtle differences in behavior.

Finally, I am planning to give a talk on ODB (and another one on my experience adding C++11 support to ODB) at the 2013 C++Now conference. Hope to see you all there.

Multi-Database Support

There is a pretty long list of features that we would like ODB to support. There is also a much shorter list of Big Features, things like database schema evolution or the SQL-to-C++ compiler. Well, with the release of 2.2.0, I am happy to say this latter list got one item shorter.

Multi-database support allows you to use multiple database systems (e.g., MySQL, SQLite, etc.) from the same application. It comes in two flavors: static and dynamic.

With static support you use the concrete database interfaces, that is odb::mysql::database, odb::sqlite::database, etc. Here is a quick example:

#include "person.hxx"
#include "person-odb-mysql.hxx"
#include "person-odb-sqlite.hxx"
 
odb::mysql::database my_db ("john", "secret", "test_db");
odb::sqlite::database sl_db ("/tmp/test_db");
 
person p ("John", "Doe");
 
{
  odb::mysql::transaction t (my_db.begin ());
  my_db.persist (p); // Store in MySQL.
  t.commit ();
}
 
{
  odb::sqlite::transaction t (sl_db.begin ());
  sl_db.persist (p); // Store in SQLite.
  t.commit ();
}

With dynamic support the same application code can access multiple databases via a common interface. Here is how we can change the above example to take advantage of the dynamic support:

#include "person.hxx"
#include "person-odb.hxx"
 
void
store (odb::database& db, const person& p)
{
  odb::transaction t (db.begin ());
  db.persist (p);
  t.commit ();
}
 
odb::mysql::database my_db ("john", "secret", "test_db");
odb::sqlite::database sl_db ("/tmp/test_db");
 
person p ("John", "Doe");
 
store (my_db, p); // Store in MySQL.
store (sl_db, p); // Store in SQLite.

As you may have noticed, the idea behind static and dynamic multi-database support is conceptually the same as that of ordinary and virtual functions. If the functions are not virtual, then we have to use the derived type to make sure the correct one is called. In contrast, with virtual functions, we can use the common base.

The advantages and disadvantages of static vs dynamic support are also similar to those of ordinary and virtual functions. In particular, the dynamic support adds a slight overhead and you cannot use database-specific features, extensions, etc. The main advantage of dynamic multi-database support is the database-independence of the code that we write. The same application code will work with any database system supported by ODB. Dynamic support also allows the application to dynamically load the database code for individual databases if and when necessary.
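The ordinary vs virtual function analogy can be made concrete with a deliberately simplified sketch in plain C++. The class and function names below are invented for illustration and are not the actual ODB interfaces:

```cpp
#include <string>

// Common interface, analogous to odb::database in dynamic support
// (simplified sketch; not the real class).
struct database
{
  virtual ~database () {}
  virtual std::string name () const = 0;
};

// Concrete databases, analogous to odb::mysql::database, etc.
struct mysql_database: database
{
  std::string name () const override {return "mysql";}
};

struct sqlite_database: database
{
  std::string name () const override {return "sqlite";}
};

// Database-independent code: works with any database via the common
// interface, just like code written against dynamic support.
inline std::string
which (const database& db)
{
  return db.name ();
}
```

Code written against the base interface pays a virtual-call-like overhead but works with any derived database, while code written against a concrete class can use everything that class offers.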

Note also that it is possible to mix and match static and dynamic support in the same application. In fact, dynamic support is built on top of static support so it is possible to use the same database both “statically” and “dynamically”. In particular, this ability to “drop down” from dynamic to static support can be used to access database specific features and extensions.

Well, this is just a quick, high-level overview of this feature. The ODB manual has a whole chapter dedicated to multi-database support.

Prepared Queries

When you call persist() or load(), ODB uses (and caches) prepared statements to perform the database operation. For queries, that is, the query() call, however, ODB has no knowledge about how many times the same query will be performed and so does not cache the statement. This can be a significant overhead if you are executing the same query many times. For example:

typedef odb::query<person> query;
typedef odb::result<person> result;
 
unsigned short age;
query q (query::age > query::_ref (age));
 
for (age = 1; age < 100; age++)
{
  // Prepare, execute, release.
  result r (db.query<person> (q));
  ...
}

To overcome this we have added support for prepared queries. They are a thin wrapper around the underlying database system’s prepared statements functionality and provide a way to perform potentially expensive query preparation tasks only once and then execute the query multiple times. Here is how we can improve the above example:

typedef odb::query<person> query;
typedef odb::prepared_query<person> prep_query;
typedef odb::result<person> result;
 
unsigned short age;
query q (query::age > query::_ref (age));
 
transaction t (db.begin ());
 
// Prepare.
prep_query pq (db.prepare_query<person> ("person-age-query", q));
 
for (age = 1; age < 100; age++)
{
  // Execute.
  result r (pq.execute ());
  ...
}
t.commit (); // Release (or when pq goes out of scope).

What if we wanted to use a prepared query at different times and/or different places in our application? To support this, ODB allows you to cache the prepared query on the connection. Alternatively, you can provide a factory (can be a C++11 lambda) that will prepare a query when/if necessary. This is especially useful for multi-threaded applications since prepared queries are database connection-specific.
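The idea of caching a prepared query under a name, with a factory that prepares it on first use, can be sketched in plain C++. This is an illustration only: the cache layout, the prep_query stand-in, and the lookup() signature below are all invented; see Section 4.5 of the manual for the real API.

```cpp
#include <functional>
#include <map>
#include <string>

// Stand-in for a prepared query (invented for this sketch).
struct prep_query
{
  std::string text;
};

// Sketch of per-connection, name-based caching with a factory.
struct connection
{
  std::map<std::string, prep_query> cache_;

  // Return the cached query, preparing it with the factory on first
  // use (the factory can be a C++11 lambda).
  prep_query&
  lookup (const std::string& name,
          const std::function<prep_query ()>& factory)
  {
    auto i (cache_.find (name));
    if (i == cache_.end ())
      i = cache_.emplace (name, factory ()).first;
    return i->second;
  }
};
```

Because each connection keeps its own cache, different threads working on different connections never share a prepared query, which mirrors why prepared queries are connection-specific.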

For more information on this feature, see Section 4.5, “Prepared Queries” in the ODB manual. There is also the prepared example in the odb-examples package that has a nice run down of various scenarios.

Change-Tracking Containers

Let’s say we are working on an object model for a bug tracker. A bug report might be represented with the following persistent class:

#pragma db object
class bug
{
  ...
 
  #pragma db id auto
  unsigned int id_;
 
  std::string summary_;
  std::string description_;
 
  std::vector<std::string> comments_;
};

Let’s say someone added a comment and we need to update the bug object in the database. Let’s also assume the report already includes quite a few comments. How many database statements will ODB execute to synchronize the state?

ODB has no knowledge of which elements in the comments vector were added, removed, or modified (ODB does not store original versions of objects to perform change tracking, though this can be supported with a custom session; see below). So ODB has no choice but to assume the worst-case scenario (everything has changed) and completely re-create the container state. This can be a significant overhead in our case, since all we did was add one element and a single INSERT statement would have sufficed.

To overcome this, ODB now supports change-tracking containers. They do exactly what the name says: they track changes you make and then use that information to minimize the number of database operations necessary to synchronize the container state with the database. In this release ODB provides change-tracking equivalents for std::vector and QList, called odb::vector and QOdbList, respectively. Equivalents for other containers are planned for future versions. It is also possible to add change tracking support for a custom container. In the scenario described above, if we replace std::vector with odb::vector, the update operation will result in exactly one INSERT.

A change-tracking equivalent can be used instead of the original with minimal changes. Underneath, the change-tracking version uses the original container to store the elements. It is implicit-convertible to and implicit-constructible from the original. This means, for example, that you can pass odb::vector anywhere const std::vector& is expected. The change state overhead is also minimized. For ordered containers it is just 2 bits per element.

In fact, 95% of the interface is exactly the same as the original. Probably the major difference is the removal of the non-const element access operators: operator[] in the container and operator* and operator-> in the non-const iterator (const versions are still available). Instead, they were replaced with the explicit modify() functions.

The ODB manual has a lot more on this feature (Section 5.4, “Change-Tracking Containers”). Sections 5.4.1, “Change-Tracking vector” and 22.3.1, “Change-Tracking QList” have more information on the interface differences.

Automatic char[N] Mapping

Another neat feature is automatic mapping for char[N] (and wchar_t[N] for SQLite on Windows and SQL Server). By default, the database type used is VARCHAR(N-1) (or equivalent; -1 is for the '\0' terminator), though you can change that to, say, CHAR(N) or VARCHAR(N) if you don’t need the terminator; ODB supports both ways. Here is an example:

#pragma db object
class address
{
  ...
 
  char street[64]; // VARCHAR(63)
 
  #pragma db type("CHAR(2)")
  char state[2];
};

This should be especially useful for mobile/embedded systems where you may want to avoid any dynamic allocations (for example, caused by std::string) in your objects.

You can also use C++11 std::array<char, N> instead of char[N], though you will need to provide the mapping explicitly. For example:

#pragma db object
class address
{
  ...
 
  #pragma db type("VARCHAR(63)")
  std::array<char, 64> street;
 
  #pragma db type("CHAR(2)")
  std::array<char, 2> state;
};

For more information on this new mapping, refer to the ODB manual “Type Mapping” sections for each database system.

Some other cool features in this release that I would like to mention briefly include support for transformations of automatically-derived SQL names. You can now add prefixes/suffixes to table, column, index, and sequence names, convert them to upper/lower case, or do custom regex-based transformations.

Then there is support for custom sessions that I mentioned above. The common/session/custom test in the odb-tests package demonstrates how to implement a custom session that performs change tracking (by keeping the originals) and automatic flushing of the modified objects. Custom sessions are a very powerful extension mechanism if you are into that kind of thing.