Archive for February, 2015

ODB 2.4.0 Released

Wednesday, February 11th, 2015

ODB 2.4.0 was released today. The big features in this release are bulk operations support and object loading views. There is also a bunch of smaller but still quite important new additions. Read on for the details.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, and without manually writing any of the mapping code. ODB natively supports SQLite, PostgreSQL, MySQL, Oracle, and Microsoft SQL Server. Pre-built packages are available for GNU/Linux, Windows, Mac OS X, and Solaris. Supported C++ compilers include GCC, MS Visual C++, Sun CC, and Clang.

This release packs quite a few new features, including already mentioned bulk operations (Oracle and SQL Server) and object loading views as well as support for calling MySQL and SQL Server stored procedures, ability to specify join types, and a bunch of other smaller stuff. For the complete list of changes, see the official ODB 2.4.0 announcement. As always, below I am going to examine the more notable new features in more technical detail.

Before we go into detail on the new ODB features, let me mention a few other major improvements and additions. First, of all, there is support for Visual Studio 2013 (VC++ 12.0) including project/solution files for all the runtimes, examples, and tests. We have also upgraded the private copy of GCC that is used by the ODB compiler binary packages from 4.7.2 to 4.9.2 (actually, it is the 4.9.3 pre-release) on all the platforms except Solaris (4.7.2 is already a plenty good match for Sun CC ;-)). This should make a difference to folks wanting to use more of the C++11/C++14 features. Finally, the ODB compiler is now GCC 5-ready.

Bulk Operations

As one potential user told me once, in his company, an ORM that doesn’t support bulk operations is automatically disqualified from any consideration. You can understand why: something like a bulk INSERT can be an order of magnitude faster than doing the same one object at a time. And if you are inserting millions of records, that can make a difference.

If you are not familiar with the notion of bulk operations (or bulk/batch database statement execution, as it is known at the “SQL level”), here is the idea in a nutshell: Some databases (from the ODB-supported ones, only Oracle and SQL Server) allow you to run the same statement (say, INSERT) on multiple sets of data with a single API call. More specifically, instead of providing a set of values for a single row, you provide an array of, say, 10,000 rows. You then call the database API saying that the same statement should be executed for these 10,000 rows. Underneath, we can speculate, the database runtime simply stream all this data to the server without any of the back-and-forth communication that would happen if we were executing a single statement 10,000 times. The result is usually a massive performance improvement.

If you are familiar with the complexity of parameter binding in OCI/ODBC, then you would probably think that binding an array of parameters must take that to a whole new level. And you would be absolutely right. In ODB, however, this is all taken care of under the hood and all you get is a simple API:

std::vector<person> v;
 
// Fill v with people.
 
db.persist (v.begin (), v.end ());

There is, however, one place where the complexity of the underlying bulk statement execution spills over into the ODB interface. And that is error handling. Both Oracle and SQL Server don’t stop when a row in the array causes an error. Instead, they notice the error but keep processing the subsequent rows (strictly speaking, Oracle supports the other mode, but it is pretty much unusable since there is no way to determine at which row it stopped).

So, at the ODB level, after calling a bulk database operation, we have a set of objects and some of them may have failed. How do you report that? In ODB we now have a curious new exception called multiple_exceptions. It contains the error information in the form of other ODB exception for each failed position. The handler for such an exception could look something along these lines:

try
{
  db.persist (v.begin (), v.end ());
}
catch (const odb::multiple_exceptions& me)
{
  for (const auto& e: me)
  {
    size_t p (e.position ());
 
    try
    {
      throw e.exception ();
    }
    catch (const odb::object_already_persistent&)
    {
      cerr << p << ": duplicate: " << v[p].id () << endl;
    }
    catch (const odb::exception& e)
    {
      cerr << p << ": " << e.what () << endl;
    }
  }
}

There is, of course, quite a bit more to bulk operation support in ODB. For more information, refer to Section 15.3, “Bulk Database Operations” in the ODB manual.

Object Loading Views

Object loading views allow us to join and load multiple objects with a single SELECT execution. Let’s just see an example. Say, we have these two persistent classes:

#pragma db object
class employer
{
  ...
};
 
#pragma db object
class employee
{
  ...
  std::shared_ptr<employer> employer_;
};

And now we want to load employers that employ people over the age of 65. Without using an object loading view, we would first have to load all the employees that are seniors. Then get all their employers, and, finally, weed out duplicates. Lots of code, lots of database statement executions (one to find all the seniors and one for each employer).

With an object loading view, things could not be easier:

#pragma db view object(employer) object(employee) query(distinct)
struct employer_view
{
  shared_ptr<employer> er;
};

And to use it:

typedef odb::query<employer_view> query;
 
for (const auto& ev:
       db.query<employer_view> (query::employee::age > 65)
{
  cout << ev.er->name () << endl;
}

Object loading views also allow you to load by-value and even into existing instances. The way this is implemented is actually quite cool. In particular, it allows you to detect the NULL values. See Section 10.2, “Object Loading Views” in the ODB manual for details.

Calling Stored Procedures

ODB now includes support for calling MySQL and SQL Server stored procedures. As you might have guessed, this is done via views. Say, we have a stored procedure called employee_range that returns a list of employees in the specified age range. The SQL Server view would look like this:

#pragma db view query("EXEC employee_range (?)")
struct employee_range
{
  unsigned short age;
  std::string first;
  std::string last;
};

And here is the MySQL version:

#pragma db view query("CALL employee_range((?))")
struct employee_range
{
  unsigned short age;
  std::string first;
  std::string last;
};

Note also that in PostgreSQL calling a stored procedure is done via SELECT and no special ODB support is required:

#pragma db view query("SELECT * FROM employee_range((?))")
struct employee_range
{
  unsigned short age;
  std::string first;
  std::string last;
};

SQLite has no support for stored procedures while Oracle is still a TODO.

For more information on MySQL stored procedures, refer to Section 17.7, “MySQL Stored Procedures” and for SQL Server – Section 21.7, “SQL Server Stored Procedures”.

Join Types

Views now support specifying join types. Want an inner join instead of the default left outer? No problem:

#pragma db view object(employee) object(employer inner)
struct employee_view
{
  ...
};

Now, the above view will only return employees that are not unemployed (the employer_ pointer is not NULL). Supported join types are left, right, full, inner, and cross, though not all underlying databases support all types.