Archive for the ‘C++’ Category

Delaying function signature instantiation in C++11

Tuesday, March 20th, 2012

I think everyone has had enough of rvalue references for now, so let’s look at another interesting C++11 technique: delayed function signature instantiation. It is made possible by default function template arguments.

To understand the motivation behind this technique, let’s first review the various stages of instantiation of a class template. At the first stage all we get is just the template-id. Here is an example:

template <typename T>
class foo;
 
class bar;
 
typedef foo<bar> foo_bar;

At this stage both the template and its type arguments only need to be forward-declared, and the resulting template-id can be used in places where neither the size of the class nor its members need to be known. For example, to form a pointer or a reference:

foo_bar* p = 0;     // ok
void f (foo<bar>&); // ok
foo_bar x;          // error: need size
p->f ();            // error: foo<bar>::f is unknown

In other words, this is the same as forward-declaration for non-template classes.

The last two lines in the above example wouldn’t have been errors if we had defined the foo class template. Instead, they would have triggered the second instantiation stage during which the class definition (i.e., its body) is instantiated. In particular, this includes instantiation of all data members and member function signatures. However, this stage does not involve instantiation of member function bodies. This only happens at the third stage, when we actually use (e.g., call or take a pointer to) specific functions. Here is another example that illustrates all the stages together:

template <typename T>
class foo
{
public:
  void f (T* p)
  {
    delete p_;
    p_ = p;
  }
 
  T* p_;
};
 
class bar;
 
void f (foo<bar>&); // stage 1
foo<bar> x;         // stage 2
x.f (0);            // stage 3

While the class template definition is required for the second stage and the function definition is required for the third stage, whether the type template arguments must be defined at any of these stages depends on the template implementation. For example, the foo class template above does not require the template argument to be defined during the second stage but does require it to be defined during the third stage when f()’s body is instantiated.
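The three stages can also be sketched as a small self-contained fragment. The names holder, widget, inspect, and stage3_demo are made up for this illustration, and the type is completed before the third stage so that the body can be instantiated safely:

```cpp
// A sketch of the three stages; all names here are hypothetical.
template <typename T>
class holder
{
public:
  holder (): p_ (nullptr) {}

  // The signature only mentions T*, so instantiating the class body
  // (stage 2) works while T is still incomplete.
  void reset (T* p)
  {
    delete p_; // Needs a complete T; instantiated only when reset() is used.
    p_ = p;
  }

  T* get () const {return p_;}

private:
  T* p_;
};

class widget;                    // Forward declaration only.

void inspect (holder<widget>&);  // Stage 1: only the template-id is needed.
holder<widget> global_holder;    // Stage 2: the class body is instantiated.

class widget                     // Complete the type before stage 3.
{
public:
  int value = 42;
};

// Stage 3: calling reset() instantiates its body, which deletes a widget.
int stage3_demo ()
{
  holder<widget> h;
  h.reset (new widget);
  int v = h.get ()->value;
  h.reset (nullptr);             // Frees the widget allocated above.
  return v;
}
```

Moving the definition of widget below stage3_demo() would leave the first two stages intact and break only the third.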

Probably the best known example of class templates that don’t require the template argument to be defined during the second stage are smart pointers. This is because, like with raw pointers, we often need to form smart pointers to forward-declared types:

class bar;
 
bar* create ();                 // ok
std::shared_ptr<bar> create (); // ok

It is fairly straightforward to implement normal smart pointers like std::shared_ptr in such a way as to not require the template argument to be defined. But here is a problem that I ran into when implementing a special kind of smart pointer in ODB, called a lazy pointer. If you read some of my previous posts you probably remember what a lazy pointer is (it turned out to be a very fertile ground for discovering interesting C++11 techniques). For those new to the idea, here is a quick recap: when an object that contains lazy pointers to other objects is loaded from the database, these other objects are not loaded right away (which would be the case for normal, eager pointers such as std::shared_ptr). Instead, just the object ids are loaded and the objects themselves can be loaded later, when and if required.

A lazy pointer can be initialized with an actual pointer to a persistent object, in which case the pointer is said to be loaded. Or we can initialize it with an object id, in which case the pointer is unloaded.

When I first set out to implement a lazy pointer, I naturally added the following extra constructor to support creating unloaded pointers (in reality id_type is not defined by T but rather by odb::object_traits<T>; however this difference is not material to the discussion):

template <class T>
class lazy_shared_ptr
{
  lazy_shared_ptr (database&, const typename T::id_type&);
 
  ...
};

Do you see the problem? Remember that during the second stage function signatures get instantiated. And in order to instantiate the signature of the above constructor, the template argument must be defined, since we are looking for id_type inside this type. As a result, lazy_shared_ptr can no longer be used with forward-declared classes.

As it turns out, we can delay function signature instantiation until the third stage (i.e., when the function is actually used) by making the function itself a template. Here is how we can fix the above constructor so that we can continue using lazy_shared_ptr with forward-declared types. This method works even in C++98:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <typename ID>
  lazy_shared_ptr (database&, const ID&);
};
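Here is a minimal sketch of how such a constructor template might be used with a forward-declared type. It is not the ODB implementation: lazy_ptr, database, team, and person are stand-ins, and storing the id behind std::shared_ptr<void> is an assumption made only to keep T::id_type out of the class body.

```cpp
#include <memory>
#include <string>

struct database {};   // Hypothetical stand-in for the real database class.

template <class T>
class lazy_ptr        // Hypothetical, heavily simplified lazy pointer.
{
public:
  lazy_ptr () {}

  // The signature does not mention T::id_type, so instantiating the
  // class body (stage 2) does not require T to be defined.
  template <typename ID>
  lazy_ptr (database&, const ID& id)
  {
    // T::id_type is looked up only when this body is instantiated (stage 3).
    const typename T::id_type& real_id (id);
    id_ = std::make_shared<typename T::id_type> (real_id);
  }

  bool has_id () const {return id_ != nullptr;}

private:
  std::shared_ptr<void> id_; // Type-erased id storage (sketch assumption).
};

class person;                // Forward declaration only.

struct team
{
  lazy_ptr<person> lead;     // Stage 2 with person still incomplete.
};

class person
{
public:
  typedef std::string id_type;
};

// Stage 3: the constructor is used after person has been defined.
bool make_unloaded ()
{
  database db;
  std::string id ("jdoe");
  lazy_ptr<person> lp (db, id);
  return lp.has_id ();
}
```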

As a side note, some of you who read my previous posts about rvalue references were wondering why I used the constructor template here. Well, now you know.

The above C++98-compatible implementation has a number of drawbacks. The biggest is that we cannot use this technique for function return types. In ODB, lazy pointers also allow querying the object id of a stored object. In the C++98 mode, to keep the implementation usable on forward-declared types, I had to resort to this ugly interface:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <typename T1>
  typename T1::id_type object_id () const;
};
 
lazy_shared_ptr<object> lp = ...
cerr << lp.object_id<object> () << endl;

That is, the user has to explicitly specify the object type when calling object_id().

The second problem has to do with the looseness of the resulting interface. Now we can pass any value as id when initializing lazy_shared_ptr. While an incompatible type will get caught, it will only happen in the implementation with the resulting diagnostics pointing to the wrong place and saying the wrong thing (we have to provide our own correct “diagnostics” in the comment):

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <typename ID>
  lazy_shared_ptr (database&, const ID& id)
  {
    // Compiler error pointing here? Perhaps the id
    // argument is wrong?
    //
    const typename T::id_type& real_id (id);
    ...
  }
};

Support for default function template arguments in C++11 allows us to resolve both of these problems. Let’s start with the return type:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <typename T1 = T>
  typename T1::id_type object_id () const;
};
 
lazy_shared_ptr<object> lp = ...
cerr << lp.object_id () << endl;

The solution to the second problem is equally simple:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <typename T1 = T>
  lazy_shared_ptr (database&, const typename T1::id_type&);
};

The idea here is to inhibit template argument deduction in order to force the default type to always be used. This is similar to the trick used in std::forward().
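Putting the two C++11 fragments together, a rough sketch could look like the following. Again, lazy_ptr, database, team, and person are illustrative stand-ins and the type-erased id storage is an assumption of this sketch, not something taken from ODB:

```cpp
#include <memory>
#include <string>

struct database {};   // Hypothetical stand-in for the real database class.

template <class T>
class lazy_ptr        // Hypothetical, heavily simplified lazy pointer.
{
public:
  lazy_ptr () {}

  // T1 appears only in a nested name, so it cannot be deduced and the
  // default (T) is used. The signature is instantiated on first use.
  template <typename T1 = T>
  lazy_ptr (database&, const typename T1::id_type& id)
    : id_ (std::make_shared<typename T1::id_type> (id))
  {
  }

  // No explicit template argument is needed at the call site.
  // Must only be called on a pointer that was created from an id.
  template <typename T1 = T>
  typename T1::id_type object_id () const
  {
    return *static_cast<const typename T1::id_type*> (id_.get ());
  }

private:
  std::shared_ptr<void> id_; // Type-erased id storage (sketch assumption).
};

class person;                // Forward declaration only.

struct team
{
  lazy_ptr<person> lead;     // Still fine with an incomplete person.
};

class person
{
public:
  typedef std::string id_type;
};

std::string roundtrip ()
{
  database db;
  lazy_ptr<person> lp (db, "jdoe"); // Converted to person::id_type.
  return lp.object_id ();
}
```

Because the parameter type is now the real id type, an incompatible argument should be diagnosed at the call site rather than inside the constructor body.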

Rvalue reference pitfalls, an update

Wednesday, March 14th, 2012

My original post about rvalue reference pitfalls from last week was followed by quite a few comments, including some interesting suggestions that are worth discussing.

While most of the discussion centered around the second problem, Jonathan Rogers pointed out the following interesting observation. Consider again the lazy_shared_ptr constructor from the original article that takes shared_ptr:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <class T1>
  lazy_shared_ptr (database&, const std::shared_ptr<T1>& p)
    : p_ (p)
  {
  }
};

If we want to support efficient initialization (shall we call it move initialization?), then it seems natural to add an rvalue reference overload:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <class T1>
  lazy_shared_ptr (database&, const std::shared_ptr<T1>& p)
    : p_ (p)
  {
  }
 
  template <class T1>
  lazy_shared_ptr (database&, std::shared_ptr<T1>&& p)
    : p_ (std::move (p))
  {
  }
};

While this works, as Jonathan pointed out, an alternative way to provide the same functionality is to just have a single constructor that takes its argument by value:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <class T1>
  lazy_shared_ptr (database&, std::shared_ptr<T1> p)
    : p_ (std::move (p))
  {
  }
};

Let’s consider what happens when we use both versions to initialize lazy_shared_ptr with lvalues and rvalues. When we use the original implementation with an lvalue, the first constructor (the one taking const lvalue reference) is selected. The value is then copied to p_ using shared_ptr’s copy constructor. Using the second implementation with an lvalue causes that copy constructor to be called right away to create the temporary. This temporary is then passed to the lazy_shared_ptr constructor where it is moved to p_. So in this case the second implementation requires an extra move constructor call.

Let’s now pass an rvalue. In the first implementation the second constructor is selected and the value is passed as an rvalue reference. It is then moved to p_. When the second implementation is used, a temporary is again created but this time using a move instead of a copy constructor. The temporary is then moved to p_. In this case, again, the second implementation requires an extra move constructor call.

Considering that move constructors are normally very cheap, this makes for a good way to keep your code short and concise. But the real advantage of this approach becomes apparent when we have multiple arguments that we want to pass efficiently (this was also a topic of Sumant’s post from a few days ago). If we use the rvalue reference approach, then for n arguments we will need 2^n constructor versions.

Note, however, that the pass by value approach is only a good idea if you know for sure that the argument type provides a move constructor. If that’s not the case, then this approach will perform significantly worse compared to the rvalue reference version. This is the reason, for example, why it is not a good idea to use this technique in std::vector’s push_back().
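The copy and move counts discussed above can be checked with a small experiment. This is only a sketch: probe, by_reference, by_value, and the two counting helpers are hypothetical names, and probe merely counts constructor calls in place of shared_ptr.

```cpp
#include <utility>

// Hypothetical type that counts its copy and move constructor calls.
struct probe
{
  static int copies;
  static int moves;

  probe () {}
  probe (const probe&) {++copies;}
  probe (probe&&) {++moves;}
};

int probe::copies = 0;
int probe::moves = 0;

// First implementation: const lvalue and rvalue reference overloads.
struct by_reference
{
  by_reference (const probe& p): p_ (p) {}
  by_reference (probe&& p): p_ (std::move (p)) {}
  probe p_;
};

// Second implementation: a single constructor taking its argument by value.
struct by_value
{
  by_value (probe p): p_ (std::move (p)) {}
  probe p_;
};

struct tally
{
  int copies;
  int moves;
};

// Initialize Holder from an lvalue and report the constructor calls.
template <typename Holder>
tally lvalue_cost ()
{
  probe src;
  probe::copies = probe::moves = 0;
  Holder h (src);
  (void) h;
  return tally {probe::copies, probe::moves};
}

// Initialize Holder from an rvalue and report the constructor calls.
template <typename Holder>
tally rvalue_cost ()
{
  probe src;
  probe::copies = probe::moves = 0;
  Holder h (std::move (src));
  (void) h;
  return tally {probe::copies, probe::moves};
}
```

In both cases the by-value version pays exactly one extra move, which matches the analysis above.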

Ok, let’s now turn to the problem that triggered a lot of comments and suggestions. Here is a quick recap. We have two constructors like these:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <class ID>
  lazy_shared_ptr (database&, const ID&);
 
  template <class T1>
  lazy_shared_ptr (database&, const std::shared_ptr<T1>&);
};

One initializes a lazy pointer using an object id, creating an unloaded pointer. The other initializes it with the pointer to the actual object, creating a loaded pointer.

Now we want to add move initialization overloads for these two constructors. As it turns out, the straightforward approach doesn’t quite work:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <class ID>
  lazy_shared_ptr (database&, const ID&);
 
  template <class ID>
  lazy_shared_ptr (database&, ID&&);
 
  template <class T1>
  lazy_shared_ptr (database&, const std::shared_ptr<T1>&);
 
  template <class T1>
  lazy_shared_ptr (database&, std::shared_ptr<T1>&&);
};

As you may recall, we have two problems here. The first manifests itself when we try to initialize lazy_shared_ptr with an lvalue of the shared_ptr type:

shared_ptr<object> p = ...;
lazy_shared_ptr<object> lp (db, p);

Instead of selecting the third constructor, the overload resolution rules select the second because the rvalue reference in its second argument becomes an lvalue reference (see the original post for details on why this happens).

The second problem occurs when we try to initialize a lazy pointer with an object id that is again an lvalue. For example:

string id = ...
lazy_shared_ptr<object> lp (db, id);

In this case, instead of selecting the first constructor, the overload resolution again selects the second constructor which is again transformed to a version that has an lvalue instead of an rvalue reference for its second argument. If you are wondering why this is a problem (after all, the first two constructors accomplish essentially the same thing), consider that while the first constructor’s hypothetical implementation will use a copy constructor to initialize the id, the second constructor will most likely use the move constructor to accomplish the same. Which means that the state of our lvalue will be transferred without us explicitly asking for it, as we normally do with the std::move() call.

As Thomas noted in the comments and as I should have mentioned explicitly in the original post, the C++ mechanisms that are causing problems here are exactly the same ones that allow for perfect argument forwarding. In fact, rvalue references are primarily used to implement two related but also quite distinct things: move semantics and perfect forwarding. What happens here is that we are trying to implement move semantics but are getting perfect forwarding instead. To paraphrase the conclusion of my original post, any time you write a function like this:

template <typename T>
void f (T&&);

you always get perfect forwarding and never move semantics.

If we look closely at the two problematic cases above, we will notice that they both happen when the template argument is an lvalue reference, which results in our rvalue reference becoming an lvalue reference. When we pass an rvalue, everything works great. In fact, we never want our move initialization constructor to be called for lvalues since we have other overloads (const lvalue reference) taking care of these cases. So what we want is to disable the move initialization constructor when the template argument is an lvalue reference. As it turns out, this is not that difficult in C++11:

  template <class ID,
            typename std::enable_if<
              !std::is_lvalue_reference<ID>::value,
              int>::type = 0>
  lazy_shared_ptr (database&, ID&&);

To put this in more general terms, it is possible to reduce perfect forwarding back to just move semantics by disabling a function for template arguments that are lvalue references.
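To see the effect in isolation, here is a small sketch with a hypothetical lazy_id class that simply records which constructor was selected. It stores the id as std::string for simplicity, which is an assumption of this example only:

```cpp
#include <string>
#include <type_traits>
#include <utility>

struct database {};   // Hypothetical stand-in for the real database class.

class lazy_id         // Hypothetical class holding just an id.
{
public:
  // Copy initialization: handles lvalues (and anything else not moved).
  template <class ID>
  lazy_id (database&, const ID& id): id_ (id), moved_ (false) {}

  // Move initialization: removed from overload resolution when ID is
  // deduced as an lvalue reference, that is, when an lvalue is passed.
  template <class ID,
            typename std::enable_if<
              !std::is_lvalue_reference<ID>::value,
              int>::type = 0>
  lazy_id (database&, ID&& id): id_ (std::move (id)), moved_ (true) {}

  bool moved () const {return moved_;}

private:
  std::string id_;
  bool moved_;
};

// An lvalue must be copied and left untouched.
bool lvalue_is_copied ()
{
  database db;
  std::string id ("abc");
  lazy_id lp (db, id);
  return !lp.moved () && id == "abc";
}

// An explicit std::move() selects the move overload.
bool rvalue_is_moved ()
{
  database db;
  std::string id ("abc");
  lazy_id lp (db, std::move (id));
  return lp.moved ();
}
```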

Rvalue reference pitfalls

Tuesday, March 6th, 2012

I just finished adding initial C++11 support to ODB (will write more about that in a separate post) and, besides other things, this involved using rvalue references, primarily to implement move constructors and assignment operators. While I talked about some of the tricky aspects of rvalue references in the “Rvalue references: the basics” post back in 2008 (it is also a good general introduction to the subject), the experience of writing real-world code that uses this feature brought a whole new realization of the potential pitfalls.

Probably the most important thing to always keep in mind when working with rvalue references is this: once an rvalue reference is given a name, it is treated as an lvalue. Consider this function:

void f (int&& x)
{
}

What is the type of the argument in this function? It is an rvalue reference to int (i.e., int&&). And what is x inside f()’s body? Its declared type is still int&&, but the expression x is an lvalue, so it behaves like a normal (lvalue) reference to int (i.e., int&). This takes some getting used to.
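A pair of overloads makes this rule easy to observe. In the sketch below (all function names are made up for the illustration) decltype confirms that the declared type of x stays int&&, while overload resolution shows that the expression x is treated as an lvalue:

```cpp
#include <type_traits>
#include <utility>

// Overload pair used to observe how the argument binds.
inline bool binds_as_rvalue (int&)  {return false;}
inline bool binds_as_rvalue (int&&) {return true;}

inline bool named_use (int&& x)
{
  static_assert (std::is_same<decltype (x), int&&>::value,
                 "the declared type is still int&&");

  return binds_as_rvalue (x);             // x is an lvalue expression here.
}

inline bool moved_use (int&& x)
{
  return binds_as_rvalue (std::move (x)); // std::move() yields an rvalue.
}
```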

Let’s see what happens when we forget about this rule. Here is a naive implementation of a move constructor in a simple class:

struct base
{
  base (const base&);
  base (base&&);
};
 
struct object: base
{
  object (object&& obj)
      : base (obj),
        nums (obj.nums)
  {
  }
 
private:
  std::vector<int> nums;
};

Here, instead of calling move constructors for base and nums we call their copy constructors! Why? Because obj has a name and is therefore an lvalue, not an rvalue. The really bad part to this story is this: if you make such a mistake, there will be no compilation or runtime error. It will only manifest itself as sub-optimal performance which can easily go unnoticed for a long time.

So how do we fix this? The fix is to always remember to convert the lvalue back to an rvalue with the help of std::move():

  object (object&& obj)
      : base (std::move (obj)),
        nums (std::move (obj.nums))
  {
  }
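Since the mistake is silent, one way to catch it is to count constructor calls. The following sketch uses a hypothetical probe member in place of base and nums:

```cpp
#include <utility>

// Hypothetical member type that counts copy and move constructor calls.
struct probe
{
  static int copies;
  static int moves;

  probe () {}
  probe (const probe&) {++copies;}
  probe (probe&&) {++moves;}
};

int probe::copies = 0;
int probe::moves = 0;

struct naive_object
{
  naive_object () {}

  // obj.member is an lvalue, so this calls the copy constructor.
  naive_object (naive_object&& obj): member (obj.member) {}

  probe member;
};

struct fixed_object
{
  fixed_object () {}

  // std::move() turns the member back into an rvalue.
  fixed_object (fixed_object&& obj): member (std::move (obj.member)) {}

  probe member;
};

// Number of member copies made while move-constructing an object.
template <typename O>
int copies_on_move ()
{
  O src;
  probe::copies = probe::moves = 0;
  O dst (std::move (src));
  (void) dst;
  return probe::copies;
}
```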

What if one of the members doesn’t provide a move constructor? In this case the copy constructor will silently be called instead. This can also be sub-optimal or even plain wrong, for example, in case of a raw pointer. If the member’s type provides swap(), then this can be a good backup plan:

  object (object&& obj)
      : base (std::move (obj))
  {
    nums.swap (obj.nums);
  }

Ok, that was a warmup. Ready for some heavy lifting? Let’s start with this simple code fragment:

typedef int& rint;
typedef rint& rrint;

What is rrint? Right, it is still a reference to int. The same logic holds for rvalue references:

typedef int&& rint;
typedef rint&& rrint;

Here rrint is still an rvalue reference to int. Things become more interesting when we try to mix rvalue and lvalue references:

typedef int&& rint;
typedef rint& rrint;

What is rrint? Is it an rvalue, lvalue, or some other kind of reference (lrvalue reference, anyone)? The correct answer is it’s an lvalue reference to int. The general rule is that as soon as we have an lvalue reference anywhere in the mix, the resulting type will always be an lvalue reference.
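All four combinations can be verified at compile time with std::is_same. The typedef names lref and rref and the helper function are chosen just for this sketch:

```cpp
#include <type_traits>

typedef int&  lref;
typedef int&& rref;

// Reference collapsing: only && applied to && stays an rvalue reference.
static_assert (std::is_same<lref&,  int&>::value,  "&  with &  gives &");
static_assert (std::is_same<rref&&, int&&>::value, "&& with && gives &&");
static_assert (std::is_same<rref&,  int&>::value,  "&& with &  gives &");
static_assert (std::is_same<lref&&, int&>::value,  "&  with && gives &");

// The mixed cases both end up as lvalue references.
constexpr bool mixed_collapses_to_lvalue ()
{
  return std::is_lvalue_reference<rref&>::value &&
         std::is_lvalue_reference<lref&&>::value;
}
```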

You may be wondering why on earth anyone would create an lvalue reference to an rvalue reference or an rvalue reference to an lvalue reference. While you probably won’t do it directly, this can happen more often than one would think in template code. And I don’t think the resulting interactions with other C++ mechanisms, such as automatic template argument deduction, are well understood yet.

Here is a concrete example from my work on C++11 support in ODB. But first a bit of context. For standard smart pointers, such as std::shared_ptr, ODB provides lazy versions, such as odb::lazy_shared_ptr. In a nutshell, when an object that contains lazy pointers to other objects is loaded from the database, these other objects are not loaded right away (which would be the case for normal, eager pointers). Instead, just the object ids are loaded and the objects themselves can be loaded later, when and if required.

A lazy pointer can be initialized with an actual pointer to a persistent object, in which case the pointer is said to be loaded. Or we can initialize it with an object id, in which case the pointer is unloaded. Here are the signatures of the two constructors in question:

template <class T>
class lazy_shared_ptr
{
  ...
 
  template <class ID>
  lazy_shared_ptr (database&, const ID&);
 
  template <class T1>
  lazy_shared_ptr (database&, const std::shared_ptr<T1>&);
};

Seeing that we now have rvalue references, I’ve decided to go ahead and add move versions for these two constructors. Here is my first attempt:

  template <class ID>
  lazy_shared_ptr (database&, const ID&);
 
  template <class ID>
  lazy_shared_ptr (database&, ID&&);
 
  template <class T1>
  lazy_shared_ptr (database&, const std::shared_ptr<T1>&);
 
  template <class T1>
  lazy_shared_ptr (database&, std::shared_ptr<T1>&&);

Let’s now see what happens when we try to create a loaded lazy pointer:

shared_ptr<object> p (db.load<object> (...));
lazy_shared_ptr<object> lp (db, p);

One would expect that the third constructor will be used in this fragment but that’s not what happens. Let’s see how the overload resolution and template argument deduction work here. The type of the second argument in the lazy_shared_ptr constructor call is shared_ptr<object>& and here are the signatures that we get:

lazy_shared_ptr (database&, const shared_ptr<object>&);
lazy_shared_ptr (database&, (shared_ptr<object>&)&&);
lazy_shared_ptr (database&, const shared_ptr<object>&);
lazy_shared_ptr (database&, shared_ptr<object>&&);

Take a closer look at the second signature. Here the template argument is an lvalue reference. On top of that we add an rvalue reference. But, as we now know, this is still just an lvalue reference. So in effect our candidate list is as follows and, unlike our expectations, the second constructor is selected:

lazy_shared_ptr (database&, const shared_ptr<object>&);
lazy_shared_ptr (database&, shared_ptr<object>&);
lazy_shared_ptr (database&, const shared_ptr<object>&);
lazy_shared_ptr (database&, shared_ptr<object>&&);

In other words, the second constructor, which was supposed to take an rvalue reference was transformed to a constructor that takes an lvalue reference. This is some profound stuff. Just think about it: given its likely implementation, this constructor can now silently “gut” an lvalue without us ever indicating this desire with an explicit std::move() call.
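The selection can be reproduced with a stripped-down sketch in which each constructor only records which overload was chosen. The lazy_ptr, database, object, and ctor_kind names are stand-ins for this illustration:

```cpp
#include <memory>

struct database {};   // Hypothetical stand-in for the real database class.
struct object {};     // Hypothetical persistent class.

enum ctor_kind {id_copy, id_move, ptr_copy, ptr_move};

template <class T>
struct lazy_ptr       // Hypothetical: only records the selected overload.
{
  ctor_kind selected;

  template <class ID>
  lazy_ptr (database&, const ID&): selected (id_copy) {}

  template <class ID>
  lazy_ptr (database&, ID&&): selected (id_move) {}

  template <class T1>
  lazy_ptr (database&, const std::shared_ptr<T1>&): selected (ptr_copy) {}

  template <class T1>
  lazy_ptr (database&, std::shared_ptr<T1>&&): selected (ptr_move) {}
};

// Pass a non-const lvalue shared_ptr, as in the fragment above.
inline ctor_kind selected_for_lvalue ()
{
  database db;
  std::shared_ptr<object> p (new object);
  lazy_ptr<object> lp (db, p);
  return lp.selected;
}
```

The second overload wins here because, after reference collapsing, it binds the non-const lvalue without adding const, which makes it a better match than either const reference version.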

So how can we fix this? My next attempt was to strip the lvalue reference from the template argument, just like std::move() does for its return value:

  template <class ID>
  lazy_shared_ptr (
    database&,
    typename std::remove_reference<ID>::type&&);

But this inhibits template argument deduction and there is no way (nor desire, in my case) to specify template arguments explicitly for constructor templates. So, in effect, the above constructor becomes uncallable.

So what did I do in the end? Nothing. There doesn’t seem to be a way to provide such a move constructor. The more general conclusion seems to be this: function templates with arguments of rvalue references that are formed directly from template arguments can be transformed, with unpredictable results, to functions that have lvalue references instead. Preventing this using std::remove_reference will inhibit template argument deduction.

Update: I have written a follow up to this post that discusses some of the suggestions left in the comments as well as presents a solution for the above problem.