Archive for the ‘C++’ Category

Microsoft SQL Server ODBC driver for Linux

Friday, December 2nd, 2011

We mainly develop ODB on GNU/Linux and then regularly test it on other platforms. This posed an interesting challenge once we started working on support for Microsoft SQL Server. The recommended way to access SQL Server from native applications is using the SQL Server Native Client ODBC driver. The problem is (or rather was, as you will see shortly) that Native Client is only available on Windows. In our case this meant that while we could still build everything on Linux (using a MinGW cross-compiler), to actually run the tests we would have to copy everything over to a Windows box. And that would be a major inconvenience compared to running tests directly from Emacs, which is what I am used to.

Doing a few web searches didn’t yield anything useful. There is the ODBC driver that is part of the FreeTDS project but it has limited functionality (for example, it doesn’t support the Multiple Active Result Sets (MARS) feature). Then there are a number of commercial offerings with convoluted licensing models and restrictions. But the main problem with all these alternatives is that we are not really interested in testing any of these drivers. For now we are only interested in making sure that ODB works well with the Microsoft ODBC driver, since that’s what 99% of the users will use anyway.

So what we really need is the Microsoft Native Client ODBC driver for Linux. Now you may be thinking, yeah, dream on, Microsoft will never release anything like this. Well, you may be surprised, but Microsoft did exactly that. About a month ago they pre-announced a Linux driver and a preview version was made available as part of the SQL Server 2012 RC0 release. You can also browse the driver documentation online. We have been running some preliminary ODB tests with this driver and so far it has been working really well.

While this preview release of the driver is only officially supported on 64-bit RedHat EL 5, it is not too difficult to install it on 64-bit Debian or Ubuntu. Below are the instructions.

Installing SQL Server ODBC driver on Debian/Ubuntu

The first step in installing the driver is to make sure you have unixODBC 2.3.0 driver manager installed. At the time of writing, the latest version of the unixodbc package available from the Debian/Ubuntu repositories was 2.2.14. That meant I had to build and install the driver manager from sources. I didn’t try to use the install script that comes with the Microsoft driver, opting to use a modified version of their Manual Installation steps:

  1. First make sure that any older version of unixODBC that you may have installed is removed:
    $ apt-get remove libodbc1 unixodbc unixodbc-dev
    
  2. Download and unpack unixODBC-2.3.0.tar.gz (see an update below on using unixODBC-2.3.1 instead).
  3. While the Microsoft instructions show how to install unixODBC to /usr, I like to keep custom-built software in /usr/local, and installing unixODBC to this directory works just as well:
    $ ./configure --disable-gui --disable-drivers \
      --enable-iconv --with-iconv-char-enc=UTF8 \
      --with-iconv-ucode-enc=UTF16LE
    $ make
    $ sudo make install
    

The next step is to install the driver. But before we run the installation script that comes with the package, let’s make sure we have all the dependencies. For that, first download and unpack the driver archive. Inside, in the lib64/ directory, you will find the libsqlncli-11.0.so.1720.0 file. This is the actual driver. Let’s run ldd on it to see if there are any missing dependencies:

$ ldd libsqlncli-11.0.so.1720.0

Look for lines that have “not found” in them. They indicate missing dependencies. When I first ran this command on my Debian box, I got the following output:

$ ldd ./libsqlncli-11.0.so.1720.0
  libcrypto.so.6 => not found
  libodbc.so.1 => /usr/local/lib/libodbc.so.1
  libssl.so.6 => not found
  libuuid.so.1 => /lib/libuuid.so.1
  libodbcinst.so.1 => /usr/local/lib/libodbcinst.so.1
  libkrb5.so.3 => /usr/lib/libkrb5.so.3
  libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2
  libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6
  libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1
  libltdl.so.7 => /usr/lib/libltdl.so.7
  libk5crypto.so.3 => /usr/lib/libk5crypto.so.3
  libkrb5support.so.0 => /usr/lib/libkrb5support.so.0
  libkeyutils.so.1 => /lib/libkeyutils.so.1
  ...

This output indicated that libcrypto.so.6 and libssl.so.6 were missing. As a general approach to resolving missing dependencies, you can enter the library name in the Debian package search or Ubuntu package search (use the “Search the contents of packages” section) and then install the package that contains the missing library.

However, if you try to do this for libcrypto.so.6 or libssl.so.6, you won’t find any packages. The reason for this is the different versioning schemes used for these libraries in RedHat EL and Debian/Ubuntu. In Debian/Ubuntu the equivalent libraries are called libcrypto.so.0.9.8 and libssl.so.0.9.8 and are part of the libssl0.9.8 package. So to resolve these dependencies, first make sure that the libssl0.9.8 package is installed and then create the libcrypto.so.6 and libssl.so.6 symbolic links:

$ cd /usr/lib
$ sudo ln -s libssl.so.0.9.8 libssl.so.6
$ sudo ln -s libcrypto.so.0.9.8 libcrypto.so.6

Also note that if you have “not found” next to libodbc.so.1 (the unixODBC driver manager we have just installed), then this most likely means that /usr/local/lib is not in your dynamic linker search path. If that’s the case, add it to /etc/ld.so.conf and don’t forget to reload the cache by running ldconfig as root.

Once all the dependencies are met, we can finally run the script to install the driver. We have to use the --force option to ignore some of the compatibility tests performed by the script:

$ sudo bash ./install.sh install --force

To test the installation you can try to connect to the local host using sqlcmd:

$ sqlcmd -S localhost

Unless you are running the Linux edition of SQL Server (wink wink) you should get an error message indicating that a network connection could not be established. Any other error, such as inability to load a shared library, most likely indicates a missing dependency or a configuration error.
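
To also verify that the driver can be loaded through the unixODBC API that ODB and other native applications use, we can try a small C++ test program. Below is a minimal sketch; the driver name ("SQL Server Native Client 11.0") as well as the server address and credentials are placeholders that you will need to adjust to your setup. Compile it with g++ and link against the unixODBC driver manager (-lodbc):

#include <iostream>
 
#include <sql.h>
#include <sqlext.h>
 
int main ()
{
  SQLHENV env;
  SQLHDBC dbc;
 
  // Allocate the environment and connection handles.
  //
  SQLAllocHandle (SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
  SQLSetEnvAttr (env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER) SQL_OV_ODBC3, 0);
  SQLAllocHandle (SQL_HANDLE_DBC, env, &dbc);
 
  // Placeholder driver name and credentials; adjust to your setup.
  //
  SQLCHAR* conn_str = (SQLCHAR*)
    "DRIVER={SQL Server Native Client 11.0};"
    "SERVER=localhost;UID=test;PWD=test;";
 
  SQLRETURN r (SQLDriverConnect (dbc, 0, conn_str, SQL_NTS,
                                 0, 0, 0, SQL_DRIVER_NOPROMPT));
 
  if (!SQL_SUCCEEDED (r))
  {
    SQLCHAR state[6], msg[256];
    SQLINTEGER native;
    SQLSMALLINT len;
    SQLGetDiagRec (SQL_HANDLE_DBC, dbc, 1, state, &native,
                   msg, (SQLSMALLINT) sizeof (msg), &len);
    std::cerr << state << ": " << msg << std::endl;
  }
  else
  {
    std::cout << "connected" << std::endl;
    SQLDisconnect (dbc);
  }
 
  SQLFreeHandle (SQL_HANDLE_DBC, dbc);
  SQLFreeHandle (SQL_HANDLE_ENV, env);
}

Just as with sqlcmd, a network-related error means the driver itself was found and loaded; an error about an unknown data source or a missing shared library points to a configuration problem or an unresolved dependency.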

Update: After performing additional tests with ODB we have discovered that unixODBC-2.3.0 doesn’t work very well in multithreaded applications and applications that create more than one ODBC connection. However, the recently released unixODBC-2.3.1 appears to have addressed this issue. With this version all the ODB tests work on Linux just as well as on Windows. The following instructions explain how to make the Native Client ODBC driver for Linux work with unixODBC 2.3.1 instead of 2.3.0.

With the release of version 2.3.1 the unixODBC project changed the shared library version. This causes a problem when we try to use this version of unixODBC with Native Client because the driver is linked against the previous version. There are two ways to address this problem, as discussed below.

The easiest approach is to change the shared library version back to the old value in the unixODBC source distribution. Using the original instructions, after unpacking unixODBC-2.3.1 (instead of 2.3.0), open the configure file in a text editor and search for the LIB_VERSION= string. Then change it from reading:

LIB_VERSION="2:0:0"

To read:

LIB_VERSION="1:0:0"

Then follow the remainder of the original instructions without any further modifications.

The alternative approach is a bit more involved but it doesn’t require changing the shared library version. This, for example, can be preferable if you are installing unixODBC-2.3.1 from a binary package instead of building it yourself.

With this approach we install unixODBC-2.3.1 just like unixODBC-2.3.0, as described in the original instructions. Once this is done, the next step is to create a directory which will contain the “compatibility” symbolic links for the libraries. This can be any directory as long as it is not in the /etc/ld.so.conf list. The last part is important: if this directory is in ld.so.conf, things won’t work since ldconfig checks the library version embedded in the library and will ignore files that have version mismatches. This is why we cannot create the “compatibility” symlinks in, say, /usr/local/lib. However, /usr/local/lib/odbc-compat will work just fine:

$ sudo mkdir /usr/local/lib/odbc-compat

Once the directory is created, we add the following symbolic links:

$ cd /usr/local/lib/odbc-compat
$ sudo ln -s /usr/local/lib/libodbc.so.2 libodbc.so.1
$ sudo ln -s /usr/local/lib/libodbccr.so.2 libodbccr.so.1
$ sudo ln -s /usr/local/lib/libodbcinst.so.2 libodbcinst.so.1

The last step is to add the new directory to the LD_LIBRARY_PATH environment variable (remember we cannot use the ld.so.conf mechanism):

export LD_LIBRARY_PATH=/usr/local/lib/odbc-compat:$LD_LIBRARY_PATH

If you want this path to be automatically added for your login, then you can add the above line to your ~/.bash_login file. If you want this to be system-wide, then instead add it to /etc/profile.

Once all this is done you can follow the remainder of the original instructions without any further modifications.

ODB 1.6.0 released

Tuesday, October 4th, 2011

ODB 1.6.0 was released today.

In case you are not familiar with ODB, it is an object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL, or manually writing any of the mapping code.

This version includes a large number of major new features, small improvements, and bug fixes. For an exhaustive list of changes, see the official ODB 1.6.0 release announcement. As usual, below I am going to examine the most notable new features in more detail.

Views

No doubt the biggest feature in this release is the introduction of the view concept. An ODB view is a C++ class that embodies a light-weight, read-only projection of one or more persistent objects or database tables or the result of a native SQL query execution.

Some of the common applications of views include loading a subset of data members from objects or columns from database tables, executing and handling results of arbitrary SQL queries, including aggregate queries, as well as joining multiple objects and/or database tables using object relationships or custom join conditions.

Many relational databases also define the concept of views. Note, however, that ODB views are not mapped to database views. Rather, by default, an ODB view is mapped to an SQL SELECT query. However, if desired, it is easy to create an ODB view that is based on a database view.

As an example, consider a simple person persistent class:

#pragma db object
class person
{
  ...
 
  #pragma db id auto
  unsigned long id_;
 
  std::string first_;
  std::string last_;
  unsigned short age_;
};

Let’s say we want to define a view that returns the number of people stored in our database:

#pragma db view object(person)
struct person_count
{
  #pragma db column("count(" + person::id_ + ")")
  std::size_t count;
};

And here is how we can use this view to get the total head count:

odb::result<person_count> r (db.query<person_count> ());
 
const person_count& c (*r.begin ()); // Exactly one element.
cout << c.count << endl;

Or we can count only the people that match certain criteria. For example, here is how we can find out how many people in our database are younger than 30:

typedef odb::query<person_count> query;
typedef odb::result<person_count> result;
 
result r (db.query<person_count> (query::age < 30));
 
const person_count& c (*r.begin ());
cout << c.count << endl;
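
Views are not limited to aggregates. For instance, here is a sketch of a view that loads only the first and last names of each person (the column pragmas explicitly associate the view members with the corresponding object members; consult the “Views” chapter mentioned below for the exact rules):

#pragma db view object(person)
struct person_name
{
  #pragma db column(person::first_)
  std::string first;
 
  #pragma db column(person::last_)
  std::string last;
};

And a possible way to use it:

typedef odb::query<person_name> query;
typedef odb::result<person_name> result;
 
result r (db.query<person_name> (query::age > 30));
 
for (result::iterator i (r.begin ()); i != r.end (); ++i)
  cout << i->first << " " << i->last << endl;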

ODB views can be defined in terms of one or more persistent objects, database tables, a combination of the two, or as a native SQL query. As a result, there are a lot of different things that can be achieved with views. If you would like to learn more, refer to Chapter 9, “Views” in the ODB Manual. There is also the view example in the odb-examples package.

NULL Semantics

ODB now supports the so-called NULL semantics wrappers which allow us to transform any value type to a type that can have the special NULL state. We can use the standard smart pointers as well as the odb::nullable “optional” container as NULL wrappers. The Boost profile adds support for boost::shared_ptr and boost::optional while the Qt profile adds support for QSharedPointer. We can also use our own smart pointers or “optional” containers as NULL wrappers.

As an example, let’s say we would like to store the optional middle name in our person class from the previous section. Here is how we can do it using std::auto_ptr:

#pragma db object
class person
{
  ...
 
  std::string first_;
 
  #pragma db null
  std::auto_ptr<std::string> middle_;
 
  std::string last_;
};

Now, if we don’t want to incur a dynamic memory allocation just to get the NULL semantics, we can use the odb::nullable container instead:

#include <odb/nullable.hxx>
 
#pragma db object
class person
{
  ...
 
  std::string first_;
  odb::nullable<std::string> middle_;
  std::string last_;
};

Note that here we don’t need the db null pragma since odb::nullable enables NULL by default.

We could also use boost::optional instead of odb::nullable, provided we enable the Boost profile (-p boost ODB compiler option):

#include <boost/optional.hpp>
 
#pragma db object
class person
{
  ...
 
  std::string first_;
  boost::optional<std::string> middle_;
  std::string last_;
};
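
Similarly, if our project uses Qt, then QSharedPointer can serve as the NULL wrapper, provided the Qt profile is enabled (-p qt ODB compiler option). Here is a sketch along the same lines as the std::auto_ptr version above:

#include <QtCore/QString>
#include <QtCore/QSharedPointer>
 
#pragma db object
class person
{
  ...
 
  std::string first_;
 
  #pragma db null
  QSharedPointer<QString> middle_;
 
  std::string last_;
};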

For more information on this feature, refer to Section 7.3, “NULL Value Semantics” in the ODB manual.

Erase Query

The new erase_query() function allows us to delete the database state of multiple objects matching certain criteria. It uses the same query expression as the query() function. For example, this is how we can delete all the people in our database that are younger than 30:

db.erase_query<person> (odb::query<person>::age < 30);
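
It should also be possible to omit the criteria altogether and delete the state of all the person objects stored in the database (see the manual section referenced below for the exact semantics):

db.erase_query<person> ();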

For more information on this feature, refer to Section 3.10, “Deleting Persistent Objects” in the ODB manual.

BLOB Handling

It is now possible to use the std::vector<char> type to store BLOB data in the database. Note, however, that to enable this mapping, we need to explicitly specify the database type, for example:

#pragma db object
class person
{
  ...
 
  #pragma db type("BLOB")
  std::vector<char> public_key_;
};

Alternatively, we can do it on a per-type basis, for example:

typedef std::vector<char> buffer;
#pragma db value(buffer) type("BLOB")
 
#pragma db object
class person
{
  ...
 
  buffer public_key_; // Mapped to BLOB.
};

Expressive Query Syntax

Prior to this release we used the scope resolution operator (::) when referring to members inside composite values and pointed-to objects in query expressions. For example:

db.query<person> (query::employer::name == "Example, Inc");

The problem with this approach is that it is impossible to tell whether the member is inside a composite value or an object just by looking at the expression. In the above example, employer could be a composite value type or a pointer to an object. To make queries more expressive, we have changed the syntax to use the member access operator (.) when referring to members inside composite value types and the member access operator via a pointer (->) when referring to members inside related objects. As a result, the above query will look like this if employer is a composite value:

db.query<person> (query::employer.name == "Example, Inc");

And like this, if it is a pointer to an object:

db.query<person> (query::employer->name == "Example, Inc");

Other interesting new features in this release include the --table-prefix ODB compiler option, the odb::connection interface, and support for multiplexing several transactions on the same thread. For more information on these and other features, see the official ODB 1.6.0 release announcement.

Do we need std::buffer?

Tuesday, August 9th, 2011

Or, boost::buffer for starters?

A few days ago I was again wishing that there was a standard memory buffer abstraction in C++. I have already had to invent my own classes for XSD and XSD/e (XML Schema to C++ compilers), where they are used to map the XML Schema hexBinary and base64Binary types to C++. Now I have the same problem in ODB (an ORM system for C++), where I need a suitable C++ type for representing database BLOB types. This time I decided against creating yet another copy of my own buffer class and instead used the poor man’s “standard” buffer, std::vector<char>, with its unnatural interface and all.

The abstraction I am wishing for is a simple class for encapsulating the memory management of a raw memory buffer plus providing a few common operations, such as memcpy, memset, etc. So instead of writing this:

class person
{
public:
  person (char* key_data, std::size_t key_size)
    : key_size_ (key_size)
  {
    key_data_ = new char[key_size];
    std::memcpy (key_data_, key_data, key_size);
  }
 
  ~person ()
  {
    delete[] key_data_;
  }
 
  ...
 
  char* key_data_;
  std::size_t key_size_;
};

Or having to create yet another custom buffer class, we could do this:

class person
{
public:
  person (char* key_data, std::size_t key_size)
    : key_ (key_data, key_size)
  {
  }
 
  ...
 
  std::buffer key_;
};

Above I called vector<char> a poor man’s “standard” buffer. But what exactly is wrong with using it to manage a memory buffer? While it works reasonably well functionally, the interface is unnatural and some operations may not be as efficient as we would expect from a memory buffer. Let’s examine the most prominent examples of these issues.

The first problem is with how we access the underlying memory. The C++ standard defect report (DR) 464 added the data() member function to std::vector which returns a pointer to the buffer. However, there are still compilers in use that do not support this, notably GCC 3.4 and VC++ 2008/9.0. As a result, if you want your code to be portable, you will need to use the much less intuitive &b.front() expression:

vector<char> b = ...
memcpy (out, &b.front (), b.size ());

There is also a subtle issue with using front(). While it appears to be legal to call data() on an empty buffer (as long as we don’t dereference the returned pointer), it is illegal to call front(). This means that you may have to handle an empty buffer as a special case, further complicating your code:

vector<char> b = ...
memcpy (out, (b.empty () ? 0 : &b.front ()), b.size ());

The initialization of a buffer is also inconvenient and potentially inefficient. Let’s say we want to have an uninitialized buffer of 1024 bytes which we plan to fill in later. There is no way to do that with vector<char>. The best we can do is to have every byte initialized:

vector<char> b (1024); // Zero-initialized buffer.

If we want to create a buffer initialized with contents of a memory fragment, the interface we have to use is cumbersome:

vector<char> b (data, data + size);

What we want to write instead is this:

buffer b (data, size);

This initialization is also potentially inefficient. Depending on the quality of the implementation, std::vector may end up using a for loop instead of memcpy to copy the data. In fact, that’s exactly how it is done in GCC 4.5 and VC++ 2010/10.0 (Correction: as was pointed out in the comments, both GCC 4.5 and VC++ 10 optimize the case where the vector element is POD).

So I think it is quite clear that while vector<char> is workable, it is not particularly convenient or efficient.

Also, as it turns out, this is not the first time I have played with the idea of a dedicated buffer class in C++. A couple of months ago I started a thread on the Boost developer mailing list trying to see if there would be any interest in a simple buffer library in Boost. The result wasn’t very encouraging. The thread quickly splintered into discussions of various special-purpose, buffer-like data structures that people have in their applications.

On the other hand, I mentioned the buffer class at BoostCon 2011 to a couple of Boost users and got very positive responses, along the lines of “If it were there we would use it!” That’s when I got the idea of writing this article in an attempt to get feedback from the broader C++ community rather than from just the hard-core Boost developers (only they can withstand the boost-dev mailing list traffic).

While the above discussion should give you a pretty good idea about the kind of buffer class I am talking about, below I am going to show a proposed interface and provide a complete, header-only implementation (released under the Boost license), in case you would like to give it a try.

class buffer
{
public:
  typedef std::size_t size_type;
  static const size_type npos = -1;
 
  ~buffer ();
 
  explicit buffer (size_type size = 0);
  buffer (size_type size, size_type capacity);
  buffer (const void* data, size_type size);
  buffer (const void* data, size_type size, size_type capacity);
  buffer (void* data, size_type size, size_type capacity,
          bool assume_ownership);
 
  buffer (const buffer&);
  buffer& operator= (const buffer&);
 
  void swap (buffer&);
  char* detach ();
 
  void assign (const void* data, size_type size);
  void assign (void* data, size_type size, size_type capacity,
               bool assume_ownership);
  void append (const buffer&);
  void append (const void* data, size_type size);
  void fill (char value = 0);
 
  size_type size () const;
  bool size (size_type);
  size_type capacity () const;
  bool capacity (size_type);
  bool empty () const;
  void clear ();
 
  char* data ();
  const char* data () const;
 
  char& operator[] (size_type);
  char operator[] (size_type) const;
  char& at (size_type);
  char at (size_type) const;
 
  size_type find (char, size_type pos = 0) const;
  size_type rfind (char, size_type pos = npos) const;
 
private:
  char* data_;
  size_type size_;
  size_type capacity_;
  bool free_;
};
 
bool operator== (const buffer&, const buffer&);
bool operator!= (const buffer&, const buffer&);

Most of the interface should be self-explanatory. The last overloaded constructor allows us to create a buffer by reusing an existing memory block. If the assume_ownership argument is true, then the buffer object will free the memory using delete[]. The detach() function is the mirror side of this functionality in that it allows us to detach the underlying memory block and reuse it in some other way. After the call to detach() the buffer object becomes empty and we should eventually free the returned memory using delete[]. The size() and capacity() modifiers return true to indicate that the underlying buffer address has changed, in case we cached it somewhere.
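
To give a feel for the interface, here is a short usage sketch based on the declarations above (the class itself is, of course, only a proposal at this point):

void f ()
{
  // Copy 4 bytes into a new buffer.
  //
  buffer a ("abcd", 4);
 
  // Reuse an existing block, transferring ownership to the buffer,
  // which will eventually free it with delete[].
  //
  char* block = new char[256];
  buffer b (block, 0, 256, true);
 
  b.append (a); // b now holds a copy of "abcd".
 
  // Growing the buffer may move the underlying memory; the size()
  // modifier returns true if the address has changed.
  //
  if (b.size (1024))
  {
    // Re-fetch b.data () if we cached it somewhere.
  }
 
  // Detach the underlying block; b becomes empty and it is now our
  // responsibility to free the memory.
  //
  char* p (b.detach ());
  delete[] p;
}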

So, do you think we need something like this in Boost and perhaps in the C++ standard library? Do you like the proposed interface?