Archive for the ‘Design’ Category

Using C++11 auto and decltype

Tuesday, August 14th, 2012

I am sure by now you’ve heard of C++11 type deduction from initializer (aka auto). You have probably also seen a few examples or maybe even written some code that uses it. There is also another similar, yet different, new feature in C++11: an operator that returns the declaration type of an expression (aka decltype). In this post I am going to discuss a few thoughts and guidelines on using these new features in C++ applications.

But first let’s quickly recap what auto and decltype are all about. C++11 auto instructs the compiler to automatically deduce the type of a variable from its initializer. Here is an example:

auto x = f ();

When I first saw a code fragment like this, two thoughts immediately crossed my mind: “I now have no idea what the type of x is” and “This is a code readability and maintenance nightmare waiting to happen.” But as I started learning more about auto’s semantics, I began to realize that perhaps my scepticism was not justified. To see why, let’s consider what type we get for a few different declarations of f(). First, if f() returns by value, things are pretty straightforward:

int f ();
auto x = f (); // x is of type int

Things get more interesting when f() returns a reference. Does x also become a reference or does it remain a value? The answer is, it remains a value:

int& f ();
auto x = f (); // x is of type int, not int&

This is probably the most important point to keep in mind when using auto: the type that is “substituted” for auto always has its top-level reference (and top-level const) removed. When I understood this point, again, my first thought was: “This is bizarre. Now, besides not knowing what we get, we also don’t get exactly the same type.” My mind promptly envisioned all these cases where a function returns by reference but an unnecessary copy is made because auto stripped the reference. But, again, as I had more time to think about it, I realized that my fears were probably unfounded. To understand why, think about the type of the local variable (x in our example) as having two parts: the core type (int in our case) and its const-ness/reference-ness. The core type can be naturally determined from the initializer expression. However, const/reference-ness is really determined by what we plan to do with the object further down within our code. Are we just accessing it? Then our variable should probably be a const reference. Are we planning to modify it? If so, then do we want to modify a shared object or our own copy? If it is shared, then our variable should be a reference. Otherwise, it should be a value. Here are the signatures for each case:

const auto& x = f (); // x is not modified
auto& x = f ();       // x is modified, shared
auto x = f ();        // x is modified, private

In a sense, by choosing to strip the top-level reference, auto forces us to specify our intentions. Plus, if we use the above signatures for each use-case, we get an additional safety net in case the type of an initializer changes. For example, if we are expecting to modify a shared reference and the signature of f() changes to return, say, by-value instead of by-reference, we will get a compile error.
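
To make the three forms concrete, here is a minimal sketch. The counter variable and the shared() function are made up for this illustration; shared() simply plays the role of an f() that returns by reference.

```cpp
#include <cassert>

static int counter = 0;

// Hypothetical stand-in for f(): returns a reference to shared state.
int& shared () { return counter; }

// Exercises the three declaration forms against the same initializer.
bool check_forms ()
{
  counter = 1;

  const auto& r = shared (); // read-only alias, no copy is made
  auto& m = shared ();       // modifiable alias to the shared object
  auto c = shared ();        // private copy (the reference is stripped)

  m = 2; // visible through counter and through r
  c = 3; // affects only our own copy

  assert (&r == &counter && &m == &counter && &c != &counter);
  return counter == 2 && r == 2 && c == 3;
}

// If shared() were changed to return int by value, the auto& line
// above would stop compiling, because a non-const lvalue reference
// cannot bind to a temporary.
```

The const auto& and auto forms keep compiling if the function switches to returning by value; only the auto& form, which relies on sharing, is rejected.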

If you have to stop reading right now and need a single takeaway from this post, then let it be this: whenever you find yourself writing auto x, stop and ask whether you plan to modify x. If the answer is no, then change that to const auto& x.

Now that we understand auto, it is easy to define decltype. This operator evaluates to the exact declaration type of an expression, including references and all. Here is an example that contrasts the two:

int f1 ();
int& f2 ();
const int& f3 ();
auto a1 = f1 (); // a1 is int
auto a2 = f2 (); // a2 is int
auto a3 = f3 (); // a3 is int
decltype (f1 ()) d1 = f1 (); // d1 is int
decltype (f2 ()) d2 = f2 (); // d2 is int&
decltype (f3 ()) d3 = f3 (); // d3 is const int&
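
The comments above can be checked by the compiler itself. The following is a small sketch that uses static_assert and std::is_same for this purpose; the global g1 and the function bodies are assumptions added only so that the fragment links and runs.

```cpp
#include <cassert>
#include <type_traits>

int g1 = 1; // assumed backing storage for the reference-returning functions

int        f1 () { return g1; }
int&       f2 () { return g1; }
const int& f3 () { return g1; }

bool check_types ()
{
  g1 = 1;

  auto a1 = f1 ();
  auto a2 = f2 ();
  auto a3 = f3 ();
  decltype (f1 ()) d1 = f1 ();
  decltype (f2 ()) d2 = f2 ();
  decltype (f3 ()) d3 = f3 ();

  static_assert (std::is_same<decltype (a1), int>::value, "a1 is int");
  static_assert (std::is_same<decltype (a2), int>::value, "a2 is int");
  static_assert (std::is_same<decltype (a3), int>::value, "a3 is int");
  static_assert (std::is_same<decltype (d1), int>::value, "d1 is int");
  static_assert (std::is_same<decltype (d2), int&>::value, "d2 is int&");
  static_assert (std::is_same<decltype (d3), const int&>::value,
                 "d3 is const int&");

  a2 = 5; // modifies the copy only
  d2 = 7; // modifies g1 through the reference

  assert (&d2 == &g1 && &d3 == &g1);
  return a1 == 1 && a2 == 5 && a3 == 1 && d1 == 1 && d3 == 7 && g1 == 7;
}
```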

You may have noticed that the top-level const/reference stripping semantics of auto mimics that of automatic template argument deduction. In fact, in the standard, auto is defined in terms of template argument deduction. By now many people have developed a pretty good intuition about what the deduced template argument will be. We can easily extend this intuition to auto by mentally re-writing a statement like this:

const int& f ();
auto& x = f (); // auto -> const int, x is const int&

To something like this:

template <typename auto>
void g (auto& x);
g (f ()); // auto -> const int, x is const int&
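
Since auto is a keyword, the fragment above is only a mental model and will not compile as written. Here is a compilable sketch of the same idea in which the template parameter is renamed to T; the body of f() and the value 42 are assumptions made for the illustration.

```cpp
#include <cassert>
#include <type_traits>

const int& f ()
{
  static const int v = 42; // assumed value, for illustration only
  return v;
}

// T plays the role of auto in "auto& x = f ();".
template <typename T>
bool g (T& x)
{
  static_assert (std::is_same<T, const int>::value,
                 "T is deduced as const int");
  static_assert (std::is_same<decltype (x), const int&>::value,
                 "x is const int&");
  return x == 42;
}

bool check_equivalence ()
{
  auto& x = f (); // auto -> const int, x is const int&
  static_assert (std::is_same<decltype (x), const int&>::value,
                 "same type as the parameter of g()");
  assert (&x == &f ());
  return g (f ()) && x == 42;
}
```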

One interesting consequence of this equivalence is that auto also uses the special perfect forwarding deduction rules when we have just auto&&. Consider this example:

struct s {};
s        f1 ();
const s  f2 ();
      s& f3 ();
const s& f4 ();
auto&& r1 (f1 ()); //       s&&
auto&& r2 (f2 ()); // const s&&
auto&& r3 (f3 ()); //       s&
auto&& r4 (f4 ()); // const s&
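
Again, the types in the comments can be verified at compile time. The sketch below adds function bodies (an assumption, the declarations above have none) and checks each deduced type with static_assert.

```cpp
#include <cassert>
#include <type_traits>

struct s {};

s        f1 () { return s (); }
const s  f2 () { return s (); }
      s& f3 () { static s v; return v; }
const s& f4 () { static s v; return v; }

bool check_forwarding_deduction ()
{
  auto&& r1 (f1 ()); // rvalue:       auto -> s,        r1 is s&&
  auto&& r2 (f2 ()); // const rvalue: auto -> const s,  r2 is const s&&
  auto&& r3 (f3 ()); // lvalue:       auto -> s&,       r3 is s&
  auto&& r4 (f4 ()); // const lvalue: auto -> const s&, r4 is const s&

  static_assert (std::is_same<decltype (r1), s&&>::value, "s&&");
  static_assert (std::is_same<decltype (r2), const s&&>::value, "const s&&");
  static_assert (std::is_same<decltype (r3), s&>::value, "s&");
  static_assert (std::is_same<decltype (r4), const s&>::value, "const s&");

  (void) r1;
  (void) r2;

  // The lvalue cases alias the objects returned by f3() and f4().
  assert (&r3 == &f3 ());
  return &r3 == &f3 () && &r4 == &f4 ();
}
```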

While probably not very useful in ordinary code, this can be handy in generic code, if, for example, we need to forward an unknown return value to another function:

template <typename F1, typename F2, typename F3>
void compose (F1 f1, F2 f2, F3 f3)
{
  auto&& r = f1 ();
  f2 ();
  f3 (std::forward<decltype (f1 ())> (r));
}
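
To see the forwarding in action, here is one possible way to call compose(). The lambdas, the trace vector, and the run_compose() helper are hypothetical and exist only for this sketch; the first call passes an f1 that returns by value, the second an f1 that returns by reference.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

template <typename F1, typename F2, typename F3>
void compose (F1 f1, F2 f2, F3 f3)
{
  auto&& r = f1 ();
  f2 ();
  f3 (std::forward<decltype (f1 ())> (r));
}

// Hypothetical driver: records the order of calls and what f3 received.
std::vector<std::string> run_compose ()
{
  std::vector<std::string> trace;
  std::string shared ("shared");

  // f1 returns by value: r is std::string&& and f3 gets an rvalue,
  // so it is free to move from it.
  compose ([] { return std::string ("temporary"); },
           [&trace] { trace.push_back ("f2"); },
           [&trace] (std::string&& v) { trace.push_back (std::move (v)); });

  // f1 returns by reference: r is std::string& and f3 gets an lvalue
  // that refers to the original object.
  compose ([&shared] () -> std::string& { return shared; },
           [&trace] { trace.push_back ("f2"); },
           [&trace] (std::string& v) { v += "!"; trace.push_back (v); });

  assert (shared == "shared!"); // modified through the forwarded reference
  return trace;
}
```

Note that the third function in each call accepts exactly the value category that f1 produces; passing the reference-returning f1 together with the std::string&& consumer would not compile.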

C++ Event Logging with SQLite and ODB

Tuesday, August 7th, 2012

At some point most applications need some form of logging or tracing, be it error logging, general execution tracing, or some application-specific event logging. The first two are quite common and a lot of libraries have been designed to support them. The last one, however, normally requires a custom implementation for each application since the data that we wish to log, the media on which the logs are stored, and the operations that we need to perform on these logs vary greatly from case to case. In this post I am going to show how to implement event logging in C++ applications using the SQLite relational database and the ODB C++ ORM.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, and without manually writing any of the mapping code. ODB natively supports SQLite, PostgreSQL, MySQL, Oracle, and Microsoft SQL Server. Pre-built packages are available for GNU/Linux, Windows, Mac OS X, and Solaris and were tested with GNU g++, MS Visual C++, Sun CC, and Clang.

When I said that we will be storing our events in a relational database, some of you might have thought I am crazy. But SQLite is no ordinary relational database. First of all, SQLite is not your typical client-server RDBMS. Rather, it is an extremely small and fast embedded database. That is, it is linked to your application as a library and the actual database is just a file on a disk. And, as a bonus, SQLite also supports in-memory databases.

Ok, maybe you are still not convinced that SQLite is a good fit for event logging. Consider then the benefits: With a relational database we can easily store statically-typed data in our event records. With SQLite we can store the log as a file on a disk, we can have an in-memory log, or some combination of the two (SQLite allows us to save an in-memory database to a file). And don’t forget that SQLite is fully ACID-compliant. This can be especially important for being able to audit the logs with any kind of certainty that they are accurate. Finally, when it comes to operations that we can perform on our logs, it is hard to beat the generality offered by an RDBMS for data querying and manipulation.

Maybe you are almost convinced, but there is just one last problem: who wants to deal with all that SQL from within a C++ application? Well, that’s where ODB comes to the rescue. As we will see shortly, we can take advantage of all these benefits offered by SQLite without having a single line of SQL in our code. Note also that while the below discussion is focused on SQLite, nothing prevents you from using another RDBMS instead. For example, if you needed to store your logs in a central network location, then you could instead use a client-server database such as PostgreSQL.

But enough talking. Let’s see the code. A good example of an application that usually requires logging is a game server. The log could contain, for example, information about when players entered and left the game. However, for this post I’ve decided to use something less typical: an automated stock trading system. Here we would like to keep an accurate log of all the buy/sell transactions performed by the system. This way we can later figure out how come we lost so much money.

Our event record, as a C++ class, could then look like this (I am omitting accessors and instead making all the data members public for brevity):

enum order_type {buy, sell};

class event
{
public:
  event (order_type type,
         const std::string& security,
         unsigned int qty,
         double price);

  order_type order;
  std::string security;
  unsigned int qty;
  double price;
  boost::posix_time::ptime timestamp;
};

std::ostream& operator<< (std::ostream&, const event&);

The event class constructor initializes the first four data members with the passed values and sets timestamp to the current time. Here I chose to use the Boost date-time library to get a convenient time representation. Other alternatives would be to use QDateTime from Qt or just have an unsigned long long value that stores time in microseconds.
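
As an illustration of the last alternative, here is a sketch of the event class that keeps the timestamp as an unsigned long long count of microseconds obtained from std::chrono. This is not the version from event.hxx: the now_us() and to_string() helpers are hypothetical, and the output format is only modeled on the sample query output shown later in this post.

```cpp
#include <cassert>
#include <chrono>
#include <ostream>
#include <sstream>
#include <string>

enum order_type {buy, sell};

class event
{
public:
  event (order_type t, const std::string& s, unsigned int q, double p)
      : order (t), security (s), qty (q), price (p), timestamp (now_us ())
  {
  }

  // Hypothetical helper: microseconds since the system clock's epoch.
  static unsigned long long now_us ()
  {
    using namespace std::chrono;
    return static_cast<unsigned long long> (
      duration_cast<microseconds> (
        system_clock::now ().time_since_epoch ()).count ());
  }

  order_type order;
  std::string security;
  unsigned int qty;
  double price;
  unsigned long long timestamp; // microseconds, set by the constructor
};

// Prints "<timestamp> buy|sell <qty> <security> @ <price>".
std::ostream& operator<< (std::ostream& os, const event& e)
{
  return os << e.timestamp << ' '
            << (e.order == buy ? "buy" : "sell") << ' '
            << e.qty << ' ' << e.security << " @ " << e.price;
}

// Hypothetical convenience wrapper around operator<<.
std::string to_string (const event& e)
{
  std::ostringstream os;
  os << e;
  assert (os.good ());
  return os.str ();
}
```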

Ok, we have the event class but how do we store it in an SQLite database? This is actually pretty easy with ODB. The only two modifications that we have to make to our class are to mark it as persistent and add the default constructor, which we can make private:

#pragma db object no_id
class event
{
  ...

private:
  friend class odb::access;
  event () {}
};

The friend declaration grants ODB access to the private default constructor, and to the data members, in case we decide to make them private as well.

You may also be wondering what that no_id clause at the end of #pragma db means. It tells the ODB compiler that this persistent class has no object id. Normally, persistent classes will have one or more data members designated as an object id. This id uniquely identifies each instance of a class in the database. Under the hood object ids are mapped to primary keys. However, there are cases where a persistent class has no natural object id and creating an artificial one, for example, an auto-incrementing integer, doesn’t add anything other than performance and storage overheads. Many ORM implementations do not support objects without ids. ODB, on the other hand, is flexible enough to support this use-case.

Our persistent event class is ready; see the event.hxx file for the complete version if you would like to try the example yourself. The next step is to compile this header with the ODB compiler to generate C++ code that actually does the saving/loading of this class to/from the SQLite database. Here is the command line:

odb -d sqlite -s -q -p boost/date-time event.hxx

Let’s go through the options one-by-one: The -d (--database) option specifies the database we are using (SQLite in our case). The -s (--generate-schema) option instructs the ODB compiler to also generate the database schema, that is the SQL DDL statements that create tables, etc. By default, for SQLite, the schema is embedded into the generated C++ code. The -q (--generate-query) option tells the ODB compiler to generate query support code. Later we will see how to use queries to analyze our logs. However, if your application doesn’t perform any analysis or if this analysis is performed off-line by another application, then you can omit this option and save on some generated code. Finally, the -p (--profile) option tells the ODB compiler to include the boost/date-time profile. We need it because we are using boost::posix_time::ptime as a data member in our persistent class (see ODB Manual for more information on ODB profiles).

The above command will produce three new files: event-odb.hxx, event-odb.ixx, and event-odb.cxx. The first file we need to #include into our source code and the third file we need to compile and link to our application.

Now we are ready to log some events. Here is an example that opens an SQLite database, re-creates the schema, and then logs a couple of events. I’ve added a few calls to sleep() so that we can get some time gaps between records:

void log (odb::database& db, const event& e)
{
  transaction t (db.begin ());
  db.persist (e);
  t.commit ();
}

int main ()
{
  // The constructor arguments are reconstructed (assumed): the file
  // name matches the sqlite3 session below and the flags create the
  // file if it does not exist.
  odb::sqlite::database db (
    "test.db", SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE);

  // Create the database schema.
  {
    transaction t (db.begin ());
    schema_catalog::create_schema (db);
    t.commit ();
  }

  // Log some trading events.
  log (db, event (buy,  "INTC", 100, 25.10));
  log (db, event (buy,  "AMD",  200, 4.05));
  log (db, event (buy,  "ARM",  200, 26.25));
  sleep (1);
  log (db, event (sell, "AMD",  50,  5.45));
  log (db, event (buy,  "ARM",  300, 25.35));
  log (db, event (sell, "INTC", 100, 24.45));
  sleep (1);
  log (db, event (sell, "AMD",  100, 4.45));
  log (db, event (sell, "ARM",  150, 27.75));
}

We are now ready to build and run our example (see driver.cxx for the complete version). Here is the command line I used on my GNU/Linux box:

g++ -o driver driver.cxx event-odb.cxx -lodb-boost \
  -lodb-sqlite -lodb -lboost_date_time

If we now run this example, it takes a few seconds (remember the sleep() calls) and finishes without any indication of whether it was successful or not. In a moment we will add some querying code to our example which will print returned log records. In the meantime, however, we can fire up the sqlite3 utility, which is a command line SQL client for SQLite, and examine our database:

$ sqlite3 test.db
sqlite> select * from event;
0|INTC|100|25.1|2012-07-29 15:56:39.317903
0|AMD|200|4.05|2012-07-29 15:56:39.318812
0|ARM|200|26.25|2012-07-29 15:56:39.319459
1|AMD|50|5.45|2012-07-29 15:56:40.320095
0|ARM|300|25.35|2012-07-29 15:56:40.321089
1|INTC|100|24.45|2012-07-29 15:56:40.321703
1|AMD|100|4.45|2012-07-29 15:56:41.322475
1|ARM|150|27.75|2012-07-29 15:56:41.323717

As you can see, the database seems to contain all the records that we have logged.

Ok, logging was quite easy. Let’s now see how we can run various queries on the logged records. To start, a simple one: find all the transactions involving AMD:

typedef odb::query<event> query;
typedef odb::result<event> result;
transaction t (db.begin ());
result r (db.query<event> (query::security == "AMD"));
for (result::iterator i (r.begin ()); i != r.end (); ++i)
  cerr << *i << endl;
t.commit ();

Or, a much more elegant C++11 version of the same transaction:

transaction t (db.begin ());
for (auto& e: db.query<event> (query::security == "AMD"))
  cerr << e << endl;
t.commit ();

If we add this fragment at the end of our driver and run it, we will see the following output:

2012-Jul-29 16:03:05.585545 buy 200 AMD @ 4.05
2012-Jul-29 16:03:06.587784 sell 50 AMD @ 5.45
2012-Jul-29 16:03:07.590787 sell 100 AMD @ 4.45

What if we were only interested in transactions that happened in, say, the last 2 seconds? That’s also easy:

ptime tm (microsec_clock::local_time () - seconds (2));
result r (db.query<event> (query::security == "AMD" &&
                           query::timestamp >= tm));
for (result::iterator i (r.begin ()); i != r.end (); ++i)
  cerr << *i << endl;

Again, if we add this code to the driver and run it, the output will change to:

2012-Jul-29 16:03:06.587784 sell 50 AMD @ 5.45
2012-Jul-29 16:03:07.590787 sell 100 AMD @ 4.45

I think you get the picture: with ODB we can slice and dice the log in any way we please. We can even select a subset of the record members or run aggregate queries thanks to an ODB feature called views.

Another operation that we may want to perform from time to time is to trim the log. The following code will delete all the records that are older than 2 seconds:

ptime tm (microsec_clock::local_time () - seconds (2));
db.erase_query<event> (query::timestamp < tm);

What about multi-threaded applications? Can we log to the same database from multiple threads simultaneously? The answer is yes. While it takes some effort to make an SQLite database usable from multiple threads, ODB takes care of that for us. With the release of SQLite 3.7.13, we can now even share an in-memory database among multiple threads (see this note for details).

As you can see, put together, SQLite and ODB can make a pretty powerful event logging system that is also very simple to set up and use. It is fast, small, multi-threaded, and ACID-compliant. For more information on ODB, refer to the ODB project page or jump directly to the ODB Manual.

Extended Database to C++ Type Mapping in ODB

Wednesday, July 18th, 2012

When it comes to development tools, I view features that they provide as being of two kinds. The majority are of the first kind which simply do something useful for the user of the tool. But the ones I really like are features that help people help themselves in ways that I might not have foreseen. The upcoming ODB 2.1.0 release has just such a feature.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, and without manually writing any of the mapping code. ODB natively supports SQLite, PostgreSQL, MySQL, Oracle, and Microsoft SQL Server.

To understand this new feature let’s first get some background on the problem. As you probably know, these days all relational databases support pretty much the same set of “core” SQL data types. Things like integers, floating point types, strings, binary, date-time, etc. Each database, of course, has its own names for these types, but they provide more or less the same functionality across all the vendors. For each database ODB provides native support for all the core SQL types. Here by native I mean that the data is exchanged with the database in the most efficient, binary format. ODB also allows you to map any core SQL type to any C++ type so we can map TEXT to std::string, QString, or my_string (the first two mappings are provided by default).

This all sounds nice and simple and that would have been the end of the story if all that modern databases supported were core SQL types. However, most modern databases also support a slew of extended SQL types. Things like spatial types, user-defined types, arrays, XML, the kitchen sink, etc, etc (Ok, I don’t think any database supports that last one, yet). Here is a by no means complete list that should give you an idea about the vast and varying set of extended types available in each database supported by ODB:

MySQL
  • Spatial types (GEOMETRY, GEOGRAPHY)
SQLite
  • Spatial types (GEOMETRY, GEOGRAPHY) [spatialite extension]
PostgreSQL
  • XML
  • JSON
  • HSTORE (key-value store) [hstore extension]
  • Geometric types
  • Network address types
  • Enumerated types
  • Arrays
  • Range types
  • Composite types
  • Spatial types (GEOMETRY, GEOGRAPHY) [PostGIS extension]
Oracle
  • ANY
  • XML
  • Arrays (VARRAY, table type)
  • User-defined types
  • Spatial types (GEOMETRY, GEOGRAPHY)
SQL Server
  • XML
  • Alias types
  • CLR types
  • Spatial types (GEOMETRY, GEOGRAPHY)

When people first started using ODB, core SQL types were sufficient. But now, as projects become more ambitious, we have started getting questions about using extended SQL types in ODB. For example, ODB will handle std::vector<int> for us, but it will do it in a portable manner, which means it will create a separate, JOIN’ed table to store the vector elements. On the other hand, if we are using PostgreSQL, it would be much cleaner to map it to a single column of the array of integers type (INTEGER[]). Clearly we needed a way to support extended SQL types in ODB.

The straightforward way to add this support would have been to handle extended types the same way we handle the core ones. That is, for each type implement a mapping that uses native database format. However, as types become more complex (e.g., arrays, user-defined types) so do the methods used to access them in the database-native format. In fact, for some databases, this format is not even documented and the only way to understand how things are represented is to study the database source code!

So the straightforward way appears to be very laborious and not very robust. What other options do we have? The idea that is implemented in ODB came from the way the OpenGIS specification handles reading and writing of spatial values (GEOMETRY, GEOGRAPHY). OpenGIS specifies the Well-Known Text (WKT) and Well-Known Binary (WKB) formats for representing spatial values. For example, point(10, 20) in WKT is represented as the "POINT(10 20)" string. Essentially, what OpenGIS did is define an interface for the spatial SQL types in terms of one of the core SQL types (text or binary). OpenGIS also defines a pair of functions for converting between, say, WKT and GEOMETRY values (GeomFromText/AsText).

As it turns out, this idea of interfacing with an extended SQL type using one of the core ones can be used to handle pretty much any extended type mentioned in the list above. In the vast majority of cases all we need to do is cast one value to another.

So in order to support extended SQL types, ODB allows us to map them to one of the built-in types, normally a string or a binary. Given the text or binary representation of the data we can then extract it into our chosen C++ data type and thus establish a mapping between an extended database type and its C++ equivalent.

The mapping between an extended type and a core SQL type is established with the map pragma:

#pragma db map type(regex) as(subst) to(subst) from(subst)

The type clause specifies the name of the database type that we are mapping, which we will call mapped type from now on. The as clause specifies the name of the database type that we are mapping the mapped type to. We will call it interface type from now on. The optional to and from clauses specify the database conversion expressions between the mapped type and the interface type. They must contain the special (?) placeholder which will be replaced with the actual value to be converted.

The name of the mapped type is actually a regular expression pattern so we can match a class of types, instead of just a single name. We will see how this can be useful in a moment. Similarly, the name of the interface type as well as the to/from conversion expressions are actually regex pattern substitutions.

Let’s now look at a concrete example that shows how all this fits together. Earlier I mentioned std::vector<int> and how it would be nice to map it to PostgreSQL INTEGER[] instead of creating a separate table. Let’s see what it takes to arrange such a mapping.

In PostgreSQL the array literal has the {n1,n2,...} form. As it turns out, if we cast an array to TEXT, then we will get a string in exactly this format. Similarly, Postgres is happy to convert a string in this form back to an array with a simple cast. With this knowledge, we can take a stab at the mapping pragma:

#pragma db map type("INTEGER\\[\\]") \
               as("TEXT")            \
               to("(?)::INTEGER[]")  \
               from("(?)::TEXT")

In plain English this pragma essentially says this: map INTEGER[] to TEXT. To convert from TEXT to INTEGER[], cast the value to INTEGER[]. To convert the other way, cast the value to TEXT. exp::TEXT is a shorter, Postgres-specific notation for CAST(exp AS TEXT).

The above pragma will do the trick if we always spell the type as INTEGER[]. However, INTEGER [] or INTEGER[123] are also valid. If we want to handle all the one-dimensional arrays of integers, then that regex support I mentioned above comes in very handy:

#pragma db map type("INTEGER *\\[(\\d*)\\]") \
               as("TEXT")                    \
               to("(?)::INTEGER[$1]")        \
               from("(?)::TEXT")
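
To get a feel for what this pattern matches and what the substitution produces, here is a small experiment that uses std::regex with its default ECMAScript grammar as a stand-in. This is an assumption made for illustration: the ODB compiler uses its own regex machinery, and the map_type() helper below, including its whole-string matching, is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns the "to" conversion expression for a type name that matches
// the pattern as a whole, or an empty string if it does not match.
std::string map_type (const std::string& type)
{
  static const std::regex re ("INTEGER *\\[(\\d*)\\]");

  if (!std::regex_match (type, re))
    return std::string ();

  // $1 is replaced with the captured array size, which may be empty.
  return std::regex_replace (type, re, "(?)::INTEGER[$1]");
}

void check_map_type ()
{
  assert (map_type ("INTEGER[]") == "(?)::INTEGER[]");
  assert (map_type ("INTEGER []") == "(?)::INTEGER[]");
  assert (map_type ("INTEGER[123]") == "(?)::INTEGER[123]");
  assert (map_type ("TEXT").empty ());
}
```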

With the above pragma we can now have a persistent class that contains std::vector<int> mapped to INTEGER[]:

// test.hxx
#ifndef TEST_HXX
#define TEST_HXX

#include <vector>

#pragma db map type("INTEGER *\\[(\\d*)\\]") \
               as("TEXT")                    \
               to("(?)::INTEGER[$1]")        \
               from("(?)::TEXT")

#pragma db object
class object
{
public:
  #pragma db id auto
  unsigned long id;

  #pragma db type("INTEGER[]")
  std::vector<int> array;
};

#endif

Ok, that’s one half of the puzzle. The other half is to implement conversion between std::vector<int> and the "{n1,n2,...}" text representation. For that we need to provide a value_traits specialization for the std::vector<int> C++ type and the TEXT PostgreSQL type. value_traits is the ODB customization mechanism I mentioned earlier that allows us to map any C++ type to any core SQL type. Here is a sample implementation which should be pretty easy to follow. I’ve instrumented it with a few print statements so that we can see what’s going on at runtime.

// traits.hxx
#ifndef TRAITS_HXX
#define TRAITS_HXX

#include <vector>
#include <string>
#include <sstream>
#include <iostream>
#include <cstring> // std::memcpy

#include <odb/pgsql/traits.hxx>

namespace odb
{
  namespace pgsql
  {
    template <>
    class value_traits<std::vector<int>, id_string>
    {
    public:
      typedef std::vector<int> value_type;
      typedef value_type query_type;
      typedef details::buffer image_type;

      // Parse the "{n1,n2,...}" text received from the database.
      static void
      set_value (value_type& v,
                 const details::buffer& b,
                 std::size_t n,
                 bool is_null)
      {
        v.clear ();
        if (!is_null)
        {
          char c;
          std::string s (b.data (), n);
          std::cerr << "in: " << s << std::endl;
          std::istringstream is (s);
          is >> c; // '{'
          for (c = static_cast<char> (is.peek ());
               c != '}';
               is >> c)
          {
            v.push_back (int ());
            is >> v.back ();
          }
        }
      }

      // Format the vector as "{n1,n2,...}" text to send to the database.
      static void
      set_image (details::buffer& b,
                 std::size_t& n,
                 bool& is_null,
                 const value_type& v)
      {
        is_null = false;
        std::ostringstream os;
        os << '{';
        for (value_type::const_iterator i (v.begin ()),
               e (v.end ());
             i != e;)
        {
          os << *i;
          if (++i != e)
            os << ',';
        }
        os << '}';
        const std::string& s (os.str ());
        std::cerr << "out: " << s << std::endl;
        n = s.size ();
        if (n > b.capacity ())
          b.capacity (n);
        std::memcpy (b.data (), s.c_str (), n);
      }
    };
  }
}

#endif
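
If you would like to experiment with the text conversion on its own, without ODB or a database, the same logic can be pulled out into two free functions. This is only a sketch: the to_text() and from_text() names are made up, and, unlike the traits above, the parsing loop also stops when the stream fails so that malformed input cannot make it spin forever.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Format a vector as the PostgreSQL array literal "{n1,n2,...}".
std::string to_text (const std::vector<int>& v)
{
  std::ostringstream os;
  os << '{';
  for (std::vector<int>::const_iterator i (v.begin ()), e (v.end ());
       i != e;)
  {
    os << *i;
    if (++i != e)
      os << ',';
  }
  os << '}';
  return os.str ();
}

// Parse "{n1,n2,...}" back into a vector.
std::vector<int> from_text (const std::string& s)
{
  std::vector<int> v;
  std::istringstream is (s);
  char c;
  is >> c; // '{'
  for (c = static_cast<char> (is.peek ());
       is && c != '}'; // extra stream check, not in the traits version
       is >> c)        // consumes ',' or the closing '}'
  {
    v.push_back (int ());
    is >> v.back ();
  }
  return v;
}

void check_round_trip ()
{
  std::vector<int> v;
  v.push_back (1);
  v.push_back (2);
  v.push_back (3);
  assert (to_text (v) == "{1,2,3}");
  assert (from_text ("{1,2,3}") == v);
  assert (from_text (to_text (v)) == v);
}
```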

Ok, now that we have both pieces of the puzzle, let’s put everything together. The first step is to compile test.hxx (the file that defines the persistent class) with the ODB compiler. At this stage we need to include traits.hxx (the file that defines the value_traits specialization) into the generated header file. We use the --hxx-epilogue option for that. Here is a sample ODB command line:

odb -d pgsql -s --hxx-epilogue '#include "traits.hxx"' test.hxx

Let’s also create a test driver that stores the object in the database and then loads it back. Here we want to see two things: the SQL statements that are being executed and the data that is being sent to and from the database:

// driver.cxx
#include <odb/transaction.hxx>
#include <odb/pgsql/database.hxx>

#include "test.hxx"
#include "test-odb.hxx"

using namespace std;
using namespace odb::core;

int main ()
{
  odb::pgsql::database db ("odb_test", "", "odb_test");

  object o;
  o.array.push_back (1);
  o.array.push_back (2);
  o.array.push_back (3);

  transaction t (db.begin ());
  t.tracer (stderr_tracer);
  unsigned long id (db.persist (o));
  db.load (id, o);
  t.commit ();
}

Now we can build and run our test driver:

g++ -o driver driver.cxx test-odb.cxx -lodb-pgsql -lodb
psql -U odb_test -d odb_test -f test.sql

The output of the test driver is shown below. Notice how the conversion expressions that we specified in the mapping pragma ended up in the SQL statements that ODB executed in order to persist and load the object.

out: {1,2,3}
SELECT object.id,object.array::TEXT FROM object WHERE object.id=$1
in: {1,2,3}

For more information on custom database type mapping support in ODB refer to Section 12.6, “Database Type Mapping Pragmas” in the ODB manual. Additionally, the odb-tests package contains a set of tests in the <database>/custom directories that, for each database, shows how to provide custom mapping for a variety of SQL types.

While the 2.1.0 release is still several weeks out, if you would like to give the new type mapping support a try, you can use the 2.1.0.a1 pre-release.