Archive for December, 2011

OCI and MinGW

Friday, December 9th, 2011

When we started working on ODB there were lots of questions about how we were going to support each database. Should we use one of the “common denominator” APIs such as ODBC? A higher-level C++ wrapper for each database? Or the low-level, native C API that all the other APIs are based on? In the end we decided to go with what at the time seemed like the most painful way — to use the native C APIs. Yes, that meant we had to write more code and work with hairy interfaces (if you have ever dealt with OCI (Oracle Call Interface), you know what I am talking about). It also meant that support for each database would take longer to implement. But it also meant we were in complete control and could take advantage of database-specific features to make sure support for each database is as good as it can possibly be. It also meant that the resulting code would be faster (no wrapper overhead), smaller (no unnecessary dependencies), and of better quality (no third-party bugs).

Two years later, I keep getting confirmation that this was the right decision. Just the other day I built the ODB Oracle runtime, which is based on OCI, with MinGW. Does Oracle provide an OCI library build for MinGW? Of course not! But because OCI is a C library, we can take the “official” OCI import library for VC++, oci.lib, rename it to libclntsh.a, and we’ve got OCI for MinGW.

Would we have been able to use ODB with Oracle on MinGW had we chosen to use something like OCCI (the Oracle C++ wrapper for OCI)? No, we wouldn’t have — while we can use a C library built with VC++ from MinGW, the same won’t work for a C++ library. In fact, this doesn’t even work between different versions of VC++. This is why Oracle has to ship multiple versions of occi.lib but only one oci.lib. Sometimes depending on only the basics really is the right way to go.

ODB 1.7.0 released

Wednesday, December 7th, 2011

ODB 1.7.0 was released today.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, and without manually writing any of the mapping code.
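
To give a very rough feel for the API, here is a minimal sketch of making an object persistent (the person class and its constructor are hypothetical):

#include <odb/database.hxx>
#include <odb/transaction.hxx>

using namespace odb::core;

// db is a reference to a concrete database implementation,
// for example, odb::oracle::database.
person john ("John", "Doe"); // Hypothetical persistent class.

transaction t (db.begin ());
db.persist (john); // INSERTs the object state into the database.
t.commit ();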

For the complete list of changes, see the official ODB 1.7.0 announcement. While there are quite a few cool new features in this release, the biggest one, no doubt, is support for the Oracle database. As usual, below I am going to examine this and other notable new features in more detail.

Oracle support

Support for Oracle is provided by the libodb-oracle runtime library. All the standard ODB functionality is available to you when using Oracle, including support for containers, object relationships, queries, date-time types in the Boost and Qt profiles, etc. In other words, this is complete, first-class support, similar to that provided for PostgreSQL, MySQL, and SQLite. There are a few limitations, however, most of which are imposed by the underlying OCI API. Those are discussed in Chapter 16, “Oracle Database” in the ODB Manual.

For connection management in Oracle, ODB provides two standard connection factories (you can also provide your own if so desired): new_connection_factory and connection_pool_factory.

The new connection factory creates a new connection whenever one is requested. Once the connection is no longer needed, it is closed.

The connection pool factory maintains a pool of connections; you can specify the minimum and maximum connection counts for each pool created. This factory is the default choice when creating a database instance.
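
For illustration, here is a rough sketch of setting up a connection pool (I am assuming the two-argument pool constructor taking the maximum and minimum connection counts; check the libodb-oracle documentation for the exact signatures):

#include <memory> // std::auto_ptr

#include <odb/oracle/database.hxx>
#include <odb/oracle/connection-factory.hxx>

// Keep at most 20 and at least 5 open connections in the pool.
std::auto_ptr<odb::oracle::connection_factory> f (
  new odb::oracle::connection_pool_factory (20, 5));

// The factory is then passed as the trailing factory argument of
// the odb::oracle::database constructor.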

If you had any prior experience with ODB, you are probably aware that one of our primary goals is high performance and low overhead. For that we use native database APIs and all the available performance-enhancing features (e.g., prepared statements). We also cache connections, statements, and even memory buffers extensively. The Oracle runtime is no exception in this regard. The question you are probably asking now is how it stacks up, performance-wise, against the other databases that we support.

Unfortunately, Oracle Corporation doesn’t allow anyone to publish any hard performance numbers about its database. This is explicitly prohibited in the license, even for Oracle Express. So all I can do in this situation is give you some general indications and make it easy to run a few benchmarks for yourself.

The first benchmark that we normally run is the one from the Performance of ODB vs C# ORMs post. Essentially, we are measuring how fast we can load an object with a couple of dozen members from the database. In the previous announcements I mentioned that ODB with PostgreSQL 9.0.4 takes 27ms per 500 iterations (54μs per object), MySQL 5.1.49 — 24ms (48μs per object), and SQLite 3.7.5 — 7ms (14μs per object). Oracle 11.2 results are on par with PostgreSQL and MySQL. To get the exact numbers, feel free to download the benchmark source code and give it a try. The accompanying README file has more information on how to build and run the test.
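
To give an idea of the shape of this test, here is a bare-bones sketch of the load loop (the person class and the object id are illustrative, and the timing code is omitted; the real test is in the benchmark source linked above):

#include <memory> // std::auto_ptr

#include <odb/database.hxx>
#include <odb/transaction.hxx>

using namespace odb::core;

transaction t (db.begin ());

for (unsigned long i (0); i < 500; ++i)
{
  // Load the object with id 1; the total elapsed time divided by
  // 500 gives the per-object figure quoted above.
  std::auto_ptr<person> p (db.load<person> (1));
}

t.commit ();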

To measure concurrent access performance we use an update-heavy, highly contentious multi-threaded test from the ODB test suite, the kind you run to make sure things work properly in multi-threaded applications (see odb-tests/common/threads if you are interested in the details). It normally pushes my 2-CPU, 8-core Xeon E5520 machine, which runs the database server, close to 100% CPU utilization. As you may remember, PostgreSQL 9.0.4 was the star of this benchmark, beating both MySQL 5.1.49 with the InnoDB backend and SQLite 3.7.5 by a significant margin (12s vs 186s and 48s, respectively). Again, the licensing terms prevent me from revealing any concrete numbers. But let me say that Oracle performance in this test is commensurate with the amount of money one has to pay for it. In particular, the higher the edition of Oracle you are using, and thus the more CPUs it can use, the better the performance. Interestingly, even the free Express edition, which is limited to 1 CPU and 1GB of RAM, outperforms MySQL with 8 cores and 12GB of RAM available to it.

Let me also note that these numbers should be taken as general indications only. When it comes to databases, it is futile to try to extrapolate someone else’s benchmark results to your application. The only reliable approach is to create a custom test that mimics your application’s data, concurrency, and access patterns. Luckily, with ODB, creating such a test is very easy. You can use the above-mentioned benchmark source code as a starting point.

Optimistic concurrency

Another big feature in this release is support for optimistic concurrency. To make a persistent class “optimistic”, all we have to do is declare it as such and add an integer data member that will store the object version. For example:

#pragma db object optimistic
class person
{
  ...
 
  #pragma db version
  unsigned long version_;
};

Now, whenever we update the state of the person object in the database, ODB will check if it has in the meantime been modified by someone else and throw the odb::object_changed exception if that’s the case. Chapter 11, “Optimistic Concurrency” in the ODB Manual has more information as well as a nice overview of the optimistic concurrency concept, if you are not familiar with the idea.
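
A typical way to deal with this exception is to re-load the object’s now-current state and re-try the update. Here is a minimal sketch of such a retry loop (the id variable and the rename() modifier are hypothetical):

for (;;)
{
  try
  {
    transaction t (db.begin ());

    std::auto_ptr<person> p (db.load<person> (id));
    p->rename ("John Doe"); // Hypothetical modifier.
    db.update (*p);

    t.commit ();
    break; // The update succeeded.
  }
  catch (const odb::object_changed&)
  {
    // Someone else updated the object first; loop around to load
    // the new state and try again.
  }
}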

SQL statement execution tracing

Quite a few ODB users asked us to provide a way to trace SQL statements that are being executed as a result of database operations. While the database can normally provide a log of SQL statements, being able to do this from the application has a number of advantages. In particular, this way we can trace SQL statements only for a specific transaction, or even a specific set of ODB operations.

ODB allows us to specify a tracer on the database, connection, and transaction levels. We can either provide our own implementation of the tracer interface (odb::tracer), or we can pass the built-in odb::stderr_tracer implementation that prints the statements to STDERR. This example shows how we can trace all the statements executed on a specific database:

odb::database& db = ...;
db.tracer (odb::stderr_tracer);
 
...

Alternatively, we can trace only a specific transaction:

transaction t (db.begin ());
t.tracer (stderr_tracer);
 
...
 
t.commit ();

Or even a single database operation:

transaction t (db.begin ());
 
...
 
t.tracer (stderr_tracer);
db.update (obj);
t.tracer (0);
 
...
 
t.commit ();
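
We can also derive our own tracer from the odb::tracer interface, for example, to route statements to an application log. Here is a minimal sketch based on the execute() callback (see the manual for the other callbacks in this interface):

#include <iostream>

#include <odb/tracer.hxx>

class log_tracer: public odb::tracer
{
public:
  virtual void
  execute (odb::connection&, const char* statement)
  {
    // Prefix each traced statement so that it is easy to grep for.
    std::clog << "SQL: " << statement << std::endl;
  }
};

An instance of log_tracer can then be passed to the database, connection, or transaction tracer() modifier, just like stderr_tracer above.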

For more information on the new tracing support, see Section 3.12 “Tracing SQL Statement Execution” in the ODB Manual.

Read-only/const data members

ODB now allows you to mark data members as being read-only. Changes to such data members are ignored when updating the database state of an object. For example:

#pragma db object
class person
{
  ...
 
  #pragma db readonly
  std::string name_;
};

ODB automatically treats const data members as read-only. It is also possible to declare the whole persistent class as read-only.

Read-only members are primarily useful when the state of a data member is changed asynchronously in the database and should not be overwritten by object updates. In other cases, where the state of a data member never changes, declaring such a member read-only allows ODB to perform more efficient object updates. In such cases, however, it is conceptually more correct to declare such a data member as const rather than as read-only. So the above example is probably better re-written like this:

#pragma db object
class person
{
  ...
 
  const std::string name_;
};

Persistent classes without object ids

Up until this release every persistent class had to have an object id (which translates to a PRIMARY KEY in the database). But just as it is sometimes desirable to have a table without a primary key, it is sometimes desirable to have an object without an object id. With this release ODB adds support for persistent classes without object identifiers. Note, however, that such classes have to be explicitly declared as not having an object id:

#pragma db object id()
class person
{
  ...
};

Such classes also have limited functionality. In particular, they cannot be loaded with the database::load() or database::find() functions, updated with the database::update() function, or deleted with the database::erase() function. To load and delete objects without ids you can use the database::query() and database::erase_query() functions, respectively. There is no way to update such objects except by using native SQL statements.
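
For example, assuming our person class has a name_ data member mapped to the name column (hypothetical), loading and deleting such objects could look like this:

#include <odb/database.hxx>
#include <odb/transaction.hxx>

using namespace odb::core;

typedef odb::query<person> query;
typedef odb::result<person> result;

transaction t (db.begin ());

// Load all the objects matching a criterion; no object id needed.
result r (db.query<person> (query::name == "John"));

for (result::iterator i (r.begin ()); i != r.end (); ++i)
{
  // Use *i here; note that such objects cannot be update()'d.
}

// Similarly, delete by query instead of by object id.
db.erase_query<person> (query::name == "Doe");

t.commit ();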

Microsoft SQL Server ODBC driver for Linux

Friday, December 2nd, 2011

We mainly develop ODB on GNU/Linux and then regularly test it on other platforms. This posed an interesting challenge once we started working on support for Microsoft SQL Server. The recommended way to access SQL Server from native applications is using the SQL Server Native Client ODBC driver. The problem is (or rather was, as you will see shortly) that Native Client is only available on Windows. In our case this meant that while we could still build everything on Linux (using a MinGW cross-compiler), to actually run the tests we would have to copy everything over to a Windows box. And that would be a major inconvenience compared to running tests directly from Emacs, which is what I am used to.

Doing a few web searches didn’t yield anything useful. There is the ODBC driver that is part of the FreeTDS project but it has limited functionality (for example, it doesn’t support the Multiple Active Result Sets (MARS) feature). Then there are a number of commercial offerings with convoluted licensing models and restrictions. But the main problem with all these alternatives is that we are not really interested in testing any of these drivers. For now we are only interested in making sure that ODB works well with the Microsoft ODBC driver, since that’s what 99% of the users will use anyway.

So what we really need is the Microsoft Native Client ODBC driver for Linux. Now you may be thinking, yeah, dream on, Microsoft will never release anything like this. Well, you may be surprised, but Microsoft did exactly that. About a month ago they pre-announced a Linux driver and a preview version was made available as part of the SQL Server 2012 RC0 release. You can also browse the driver documentation online. We have been running some preliminary ODB tests with this driver and so far it has been working really well.

While this preview release of the driver is only officially supported on 64-bit RedHat EL 5, it is not too difficult to install it on 64-bit Debian or Ubuntu. Below are the instructions.

Installing SQL Server ODBC driver on Debian/Ubuntu

The first step in installing the driver is to make sure you have the unixODBC 2.3.0 driver manager installed. At the time of writing, the latest version of the unixodbc package available from the Debian/Ubuntu repositories was 2.2.14. That meant I had to build and install the driver manager from source. I didn’t try to use the install script that comes with the Microsoft driver, opting instead for a modified version of their Manual Installation steps:

  1. First make sure that any older version of unixODBC that you may have installed is removed:
    $ sudo apt-get remove libodbc1 unixodbc unixodbc-dev
    
  2. Download and unpack unixODBC-2.3.0.tar.gz (see an update below on using unixODBC-2.3.1 instead).
  3. While the Microsoft instructions show how to install unixODBC to /usr, I like to keep custom-built software in /usr/local, and installing unixODBC to this directory works just as well:
    $ ./configure --disable-gui --disable-drivers \
    --enable-iconv --with-iconv-char-enc=UTF8 \
    --with-iconv-ucode-enc=UTF16LE
    $ make
    $ sudo make install
    

The next step is to install the driver. But before we run the installation script that comes with the package, let’s make sure we have all the dependencies. For that, first download and unpack the driver archive. Inside, in the lib64/ directory, you will find the libsqlncli-11.0.so.1720.0 file. This is the actual driver. Let’s run ldd on it to see if there are any missing dependencies:

$ ldd libsqlncli-11.0.so.1720.0

Look for lines that have “not found” in them. They indicate missing dependencies. When I first ran this command on my Debian box, I got the following output:

ldd ./libsqlncli-11.0.so.1720.0
  libcrypto.so.6 => not found
  libodbc.so.1 => /usr/local/lib/libodbc.so.1
  libssl.so.6 => not found
  libuuid.so.1 => /lib/libuuid.so.1
  libodbcinst.so.1 => /usr/local/lib/libodbcinst.so.1
  libkrb5.so.3 => /usr/lib/libkrb5.so.3
  libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2
  libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
  libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
  libltdl.so.7 => /usr/lib/libltdl.so.7
  libk5crypto.so.3 => /usr/lib/libk5crypto.so.3
  libkrb5support.so.0 => /usr/lib/libkrb5support.so.0
  libkeyutils.so.1 => /lib/libkeyutils.so.1
  ...

This indicated that libcrypto.so.6 and libssl.so.6 were missing. As a general approach to resolving missing dependencies, you can enter the library name in the Debian package search or Ubuntu package search (use the “Search the contents of packages” section) and then install the package that contains the missing library.

However, if you try to do this for libcrypto.so.6 or libssl.so.6, you won’t find any packages. The reason for this is the different versioning schemes used for these libraries in RedHat EL and Debian/Ubuntu. In Debian/Ubuntu the equivalent libraries are called libcrypto.so.0.9.8 and libssl.so.0.9.8 and are part of the libssl0.9.8 package. So to resolve these dependencies, first make sure that the libssl0.9.8 package is installed and then create the libcrypto.so.6 and libssl.so.6 symbolic links:

$ cd /usr/lib
$ sudo ln -s libssl.so.0.9.8 libssl.so.6
$ sudo ln -s libcrypto.so.0.9.8 libcrypto.so.6

Also note that if you have “not found” next to libodbc.so.1 (the unixODBC driver manager we have just installed), then this most likely means that /usr/local/lib is not in your dynamic linker search path. If that’s the case, add it to /etc/ld.so.conf and don’t forget to reload the cache by running ldconfig as root.

Once all the dependencies are met, we can finally run the script to install the driver. We have to use the --force option to ignore some of the compatibility tests performed by the script:

$ sudo bash ./install.sh install --force

To test the installation you can try to connect to the local host using sqlcmd:

$ sqlcmd -S localhost

Unless you are running the Linux edition of SQL Server (wink wink), you should get an error message indicating that a network connection could not be established. Any other error, such as an inability to load a shared library, most likely indicates a missing dependency or a configuration error.
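
If you prefer to verify the installation from code, below is a bare-bones ODBC connectivity check that exercises the same driver manager and driver stack that ODB uses (the driver name, server, and credentials in the connection string are made up; compile as C++ and link with -lodbc):

#include <stdio.h>

#include <sql.h>
#include <sqlext.h>

int
main ()
{
  SQLHENV env;
  SQLHDBC dbc;

  SQLAllocHandle (SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
  SQLSetEnvAttr (env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER) SQL_OV_ODBC3, 0);
  SQLAllocHandle (SQL_HANDLE_DBC, env, &dbc);

  // Hypothetical DSN-less connection string; use whatever driver
  // name the installation registered in odbcinst.ini.
  SQLCHAR cs[] = "Driver={SQL Server Native Client 11.0};"
                 "Server=localhost;UID=sa;PWD=secret;";

  SQLRETURN r (SQLDriverConnect (dbc, 0, cs, SQL_NTS, 0, 0, 0,
                                 SQL_DRIVER_NOPROMPT));

  printf (SQL_SUCCEEDED (r) ? "connected\n" : "connection failed\n");

  if (SQL_SUCCEEDED (r))
    SQLDisconnect (dbc);

  SQLFreeHandle (SQL_HANDLE_DBC, dbc);
  SQLFreeHandle (SQL_HANDLE_ENV, env);
  return 0;
}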

Update: After performing additional tests with ODB we have discovered that unixODBC-2.3.0 doesn’t work very well in multithreaded applications and applications that create more than one ODBC connection. However, the recently released unixODBC-2.3.1 appears to have addressed this issue. With this version all the ODB tests work on Linux just as well as on Windows. The following instructions explain how to make the Native Client ODBC driver for Linux work with unixODBC 2.3.1 instead of 2.3.0.

With the release of version 2.3.1 the unixODBC project changed the shared library version. This causes a problem when we try to use this version of unixODBC with Native Client because the driver is linked against the previous version. There are two ways to address this problem, as discussed below.

The easiest approach is to change the shared library version back to the old value in the unixODBC source distribution. Following the original instructions, after unpacking unixODBC-2.3.1 (instead of 2.3.0), open the configure file in a text editor and search for the LIB_VERSION= string. Then change it from reading:

LIB_VERSION="2:0:0"

To read:

LIB_VERSION="1:0:0"

Then follow the remainder of the original instructions without any further modifications.

The alternative approach is a bit more involved but it doesn’t require changing the shared library version. This, for example, can be preferable if you are installing unixODBC-2.3.1 from a binary package instead of building it yourself.

With this approach we install unixODBC-2.3.1 just like unixODBC-2.3.0, as described in the original instructions. Once this is done, the next step is to create a directory which will contain the “compatibility” symbolic links for the libraries. This can be any directory as long as it is not in the /etc/ld.so.conf list. The last part is important: if this directory is in ld.so.conf, things won’t work since ldconfig checks the library version embedded in the library and will ignore files that have version mismatches. This is why we cannot create the “compatibility” symlinks in, say, /usr/local/lib. However, /usr/local/lib/odbc-compat will work just fine:

$ sudo mkdir /usr/local/lib/odbc-compat

Once the directory is created, we add the following symbolic links:

$ cd /usr/local/lib/odbc-compat
$ sudo ln -s /usr/local/lib/libodbc.so.2 libodbc.so.1
$ sudo ln -s /usr/local/lib/libodbccr.so.2 libodbccr.so.1
$ sudo ln -s /usr/local/lib/libodbcinst.so.2 libodbcinst.so.1

The last step is to add the new directory to the LD_LIBRARY_PATH environment variable (remember we cannot use the ld.so.conf mechanism):

export LD_LIBRARY_PATH=/usr/local/lib/odbc-compat:$LD_LIBRARY_PATH

If you want this path to be automatically added for your login, then you can add the above line to your ~/.bash_login file. If you want this to be system-wide, then instead add it to /etc/profile.

Once all this is done you can follow the remainder of the original instructions without any further modifications.