Archive for February, 2012

Who calls this function?

Wednesday, February 29th, 2012

Let’s say we have a large project and we want to find out from which places in our code a particular function is called. You may be wondering why you would want to know this. The most common reason is to eliminate dead code: if the function is not called, then it may not be needed anymore. Or maybe you just want to refresh your memory about this area of your code base. The case that triggered this post involved changing one function call to another. I was adding support for composite object ids in ODB and was gradually changing the code to use a more generalized version of a certain function while still maintaining the old version for the code yet to be ported. While I knew about most of the areas that needed changing, in the end I needed to verify that nobody was calling the old function and then remove it.

So how do we find out who calls a particular function? The method that I am sure most of you have used before is to comment the function out, recompile, and use the C++ compiler error messages to pinpoint the calls. There are a few problems with this approach, however. First of all, depending on your build system, the compilation may stop before showing you all the call sites (make -k is helpful here but is still not a bulletproof solution). So to make sure that you have seen all the places, you may have to keep commenting out the calls and recompiling until you get no more errors. This is annoying.

This approach will also not work if a call can be resolved to one of the overloaded versions. This was exactly the situation I encountered. I had two functions that looked like this:

class traverser
{
  void traverse (type&);   // New version.
  void traverse (class_&); // Old version.
};

Here class_ derives from type, so if I commented the old version out, the calls were happily resolved to the new version without any errors.

Another similar situation is when you have a function in the outer namespace that will be used if you comment a function in the inner namespace:

void f ();
 
namespace n
{
  void f ();
 
  void g ()
  {
    // Will resolve to outer f() if inner f() is
    // commented out.
    //
    f ();
  }
}

What’s worse is that in complex cases involving implicit conversions of arguments, some calls may be successfully resolved to an overloaded or outer version while some will trigger an error. As a result, you may not even realize that you didn’t see all the call sites.

OK, so that approach didn’t work in my case. What else can we try? Another option is to comment out just the definition of the function and see if we get any unresolved symbol errors during linking. There are many problems with this method as well. First of all, if the function in question is virtual, then this method won’t work because the virtual function table will always contain a reference to the function. Plus, all the calls to this function will go through the vtable.

If the function is not virtual, then, at best, a linker will tell you that there is an undefined reference in a specific function in a specific translation unit. For example, here is an output from the GNU Binutils ld:

/tmp/ccXez0jI.o: In function `main':
test.cxx:(.text+0x10): undefined reference to `f()'
test.cxx:(.text+0x15): undefined reference to `f()'

In particular, there will be no line information, so if a function calls the function of interest multiple times, we will have no way of knowing which call triggered the undefined symbol.

This approach also won’t work if we are building a shared library (unless we are using the -no-undefined or equivalent option) because the undefined reference won’t be reported until we link the library to an executable or try to load it (e.g., with dlopen()). And when that happens all we will get is just a note that there is an undefined reference in a library:

libtest.so: undefined reference to `f()'

In my case, since ODB is implemented as a shared library, all this method did was confirm that I still had a call to the old version of the function. I, however, had no idea even which file(s) contained these calls.

As it happens, just the day before I had been testing ODB with GCC in C++11 mode. While everything worked fine, I got a few warnings about std::auto_ptr being deprecated. As I saw them scrolling by, I made an idle note to myself that, when compiling in C++11 mode, libstdc++ probably marks auto_ptr with the GCC deprecated attribute. A day later this background note went off like a light bulb in my head: I could mark the old version of the function as deprecated, and GCC would pinpoint with a warning every single place where this function is called:

class traverser
{
  void traverse (type&);
 
  void traverse (class_&) __attribute__ ((deprecated));
};

And the diagnostic is:

model.cxx: In function ‘void object_columns::traverse(data_member&)’:
model.cxx:22:9: warning: ‘void traverser::traverse(class_&)’ is deprecated

This method is also very handy for finding out which overloaded version was selected by the compiler without resorting to a runtime test:

void f (bool) __attribute__ ((deprecated));
void f (int) __attribute__ ((deprecated));
void f (double) __attribute__ ((deprecated));
 
void g ()
{
  f (true);
  f (123);
  f (123.1);
}

And the output is:

test.cxx:7:10: warning: ‘void f(bool)’ is deprecated
test.cxx:8:9: warning: ‘void f(int)’ is deprecated
test.cxx:9:11: warning: ‘void f(double)’ is deprecated

The obvious drawback of this method is that it relies on a GCC-specific extension, though some other compilers (Clang and probably Intel C++ for Linux) also support it. If you know of similar functionality in other compilers and/or IDEs, please mention it in the comments.

Accessing static members via an instance

Tuesday, February 21st, 2012

We all know about accessing static members using a class name:

class c
{
public:
  static void f ();
  static int i;
};
 
c::f ();
c::i++;

But did you know that we can also access them using a class instance, just like we would ordinary, non-static members?

c x;
 
x.f ();
x.i++;

This always seemed weird to me since a static member doesn’t depend on an instance and, in particular, since a static function does not have the this pointer. I was always wondering what this feature could be useful for. My best guess was some template metaprogramming technique where we don’t know whether a member is static or not. However, I’ve never seen any actual code that relied on this.

Until recently, that is, when I found a perfect use for this feature (this is one of those few benefits of knowing obscure C++ language details; once in a while a problem arises for which you realize there is a quirky but elegant solution).

But first we need a bit of background on the problem. You may have heard of ODB, which provides object-relational mapping (ORM) for C++. ODB has a C++-integrated query language that allows us to query for persistent objects using a familiar C++ syntax instead of SQL. In other words, the ODB query language is a domain-specific language (DSL) embedded into C++. Here is a simple example:

class person
{
  ...
 
  std::string first_;
  std::string last_;
  unsigned short age_;
};

Given this persistent class we can perform queries like this:

typedef odb::query<person> query;
typedef odb::result<person> result;
 
result r (db.query (query::last == "Doe" && query::age < 30));

Here is how this is implemented (in slightly simplified terms): for the person class the ODB compiler will generate the odb::query template specialization that contains static “query columns” corresponding to the data members in the class, for example:

// Generated by the ODB compiler.
//
namespace odb
{
  template <>
  class query<person>
  {
    static query_column<std::string> first;
    static query_column<std::string> last;
    static query_column<unsigned short> age;
  };
}

In turn, the query_column class template overloads various C++ operators (==, !=, <, etc.) that translate a C++ expression such as:

query::last == "Doe" && query::age < 30

into an SQL WHERE clause along these lines:

last = $1 AND age < $2

with "Doe" passed for $1 and 30 for $2.

This design worked very well until we needed to add support for composite values and object pointers:

#pragma db object
class employer
{
  ...
 
  std::string name_;
};
 
#pragma db value
struct name
{
  std::string first_;
  std::string last_;
};
 
#pragma db object
class person
{
  ...
 
  name name_;
  unsigned short age_;
  shared_ptr<employer> employer_;
};

The first version of the query language with support for composite values and object pointers used nested scopes to represent both. The generated odb::query specializations in this version would look like this:

namespace odb
{
  template <>
  class query<employer>
  {
    static query_column<std::string> name;
  };
 
  template <>
  class query<person>
  {
    struct name
    {
      static query_column<std::string> first;
      static query_column<std::string> last;
    };
 
    static query_column<unsigned short> age;
 
    typedef query<employer> employer;
  };
}

And an example query would look like this:

query::name::last == "Doe" && query::employer::name == "Example, Inc"

The problem with this query is that it is not very expressive; by looking at it, it is not clear whether the name and employer components correspond to composite values or object pointers. Plus, it doesn’t mimic C++ very well. In C++ we would use the dot operator (.) to access a member of an instance, for example, name.last. Similarly, we would use the arrow operator (->) to access a member via a pointer, for example, employer->name. So what we would want, then, is to be able to write the above query expression like this:

query::name.last == "Doe" && query::employer->name == "Example, Inc"

Now it is clear that name is a composite value while employer is an object pointer.

The question now is how we can adapt the odb::query specialization to provide this syntax. And that’s where the ability to access a static data member via an instance fits right in. Let’s start with the composite member:

  template <>
  class query<person>
  {
    struct name_type
    {
      static query_column<std::string> first;
      static query_column<std::string> last;
    };
 
    static name_type name;
 
    ...
  };

query::name is now a static data member and we use the dot operator in query::name.last to access its static member.

Things get even more interesting when we consider object pointers. Remember that here we want to use the arrow operator to access nested members. To get this syntax we create this curious-looking, smart pointer-like class template:

template <typename T>
struct query_pointer
{
  T* operator-> () const
  {
    return 0; // All members in T are static.
  }
};

For fun, try showing it to your friends or co-workers and ask them what it could be useful for. Just remember to remove the comment after the return statement ;-). Here is how we use this class template in the odb::query specialization:

  template <>
  class query<person>
  {
    ...
 
    static query_pointer< query<employer> > employer;
  };

When the arrow operator is called in query::employer->name, it returns a NULL pointer. But that doesn’t matter since the member we are accessing is static and the pointer is not used.

If you know of other interesting use cases for the static member access via instance feature, feel free to share them in the comments.

Updated ODB benchmark results

Thursday, February 2nd, 2012

In the release announcement for ODB 1.8.0 I mentioned some performance numbers for using ODB with SQL Server. If you read that post, you probably remember that, to put it mildly, the numbers for SQL Server didn’t look that good compared to the other databases, especially on the API overhead benchmark.

In fact, the numbers were so bad that they made me suspect something else was going on, not just poor ODBC, Native Client, or SQL Server performance. One major difference between the SQL Server test setup and the other databases was the use of virtual machines. While all the other databases and tests were running on real hardware, SQL Server was running in a KVM virtual machine. So to make the benchmark results more accurate I decided to re-do all the tests on real, identical hardware.

High-end database hardware doesn’t normally lie around unused, so I had to settle for a dual-CPU, quad-core AMD Opteron 265 1.8GHz machine with 4GB of RAM and U320 15K Seagate Cheetah SCSI drives. While this is the right kind of hardware for a database server, it would be a very entry-level specification by today’s standards. So keep that in mind when I show the numbers below; here we are not after absolute values but rather a comparison between different database implementations, their client APIs, and the ODB runtimes for these databases.

The above machine dual-boots either Debian GNU/Linux with Linux kernel 2.6.32 or Windows Server 2008R2 SP1 Datacenter Edition. MySQL 5.5.17, PostgreSQL 9.1.2, and SQLite 3.7.9 run on Debian, while SQL Server 2008R2 runs on Windows Server. The tests were built using g++ 4.6.2 for GNU/Linux and VC++ 10 for Windows. Some benchmarks were run on remote client machines, all of which are faster than the database server. The server and clients were connected via gigabit switched Ethernet.

The first benchmark that we normally run is the one from the Performance of ODB vs C# ORMs post. Essentially we are measuring how fast we can load an object with a couple of dozen members from the database. In other words, the main purpose of this test is to measure the overhead incurred by all the intermediary layers between the object in the application’s memory and its database state, and not the database server performance itself. Specifically, the layers in question are the ODB runtime, database access API, and transport layer.

Since the transport layer can vary from application to application, we ran this benchmark in two configurations: remote and local (except for SQLite, which is an embedded database). In the remote configuration the benchmark application and the database server are on different machines connected via gigabit Ethernet using TCP. In the local configuration the benchmark and the database are on the same machine and the database API uses the most efficient communication medium available (UNIX sockets, shared memory, etc.).

The following table shows the average time it takes to load an object, in microseconds. For SQL Server we have two results for the remote configuration: one with the client running on Windows and the other with the client on GNU/Linux.

Database                    Remote   Local
MySQL                       260μs    110μs
PostgreSQL                  410μs    160μs
SQL Server/Windows Client   310μs    130μs
SQL Server/Linux Client     240μs    n/a
SQLite                      n/a      30μs

For comparison, the following table lists the local configuration results for some of the databases when tested on more modern hardware (a 2-CPU, 8-core 2.27GHz Xeon E5520 machine):

Database     Local
MySQL        55μs
PostgreSQL   65μs
SQLite       17μs

If you would like to run the benchmark on your setup, feel free to download the benchmark source code and give it a try. The accompanying README file has more information on how to build and run the test.

Now, let’s look at concurrent access performance. To measure this we use an update-heavy, highly-contentious multi-threaded test from the ODB test suite, the kind you run to make sure things work properly in multi-threaded applications (see odb-tests/common/threads if you are interested in the details). To give you an idea of the amount of work done by the test, it performs 19,200 inserts, 6,400 updates, 19,200 deletes, and 134,400 selects concurrently from 32 threads, all on the same table. This test typically pushes the database server CPU utilization to 100% on all cores. For all the databases except SQLite we ran this test in the remote configuration to make sure that each database has exactly the same resources available.

The following table shows the times it takes each database to complete this test, in seconds.

Database     Time
MySQL        98s
PostgreSQL   92s
SQL Server   102s
SQLite       154s

You may have noticed that the above tables are missing an entry for Oracle. Unfortunately, Oracle Corporation doesn’t allow anyone to publish any hard performance numbers about its database. To give you some general indications, however, let me say that Oracle 11.2 Enterprise Edition performs better than any of the other databases listed above in all the tests except for the first benchmark in the local configuration where it came very close to the top client-server performer (MySQL). In particular, in the second benchmark Oracle performed significantly better than all the other databases tested.

Let me also note that these numbers should be taken as indications only. It is futile to try to extrapolate some benchmark results to your specific application when it comes to databases. The only reliable approach is to create a custom test that mimics your application’s data, concurrency, and access patterns. Luckily, with ODB, creating such a test is a very easy job. You can use the above-mentioned benchmark source code as a starting point.