Archive for May, 2012

Perfect forwarding and overload resolution

Wednesday, May 30th, 2012

One of the things made possible in C++11 thanks to rvalue references is perfect forwarding. I am sure you’ve already read about this feature and seen some examples. However, as with many new features in C++11, once we move beyond toy code and start using them in real applications, we often discover interactions with other language features that weren’t apparent initially. I’ve already run into one somewhat surprising side effect of perfect forwarding, which I described in the “Rvalue reference pitfalls” post. Today I would like to explore another such interaction, this time between perfect forwarding and overload resolution.

If we have just one function (or constructor, etc.) that implements perfect forwarding, then the compiler’s job in resolving a call to such a function is straightforward. However, if a forwarding function is part of an overloaded set, then things can get interesting. Suppose we have the following existing set of overloaded functions:

void f (int);
void f (const std::string&);

And here are some sample calls of these functions:

int i = 1;
std::string s = "aaa";
 
f (i);     // f(int)
f (2);     // f(int)
f (s);     // f(string)
f ("bbb"); // f(string)

Suppose we now want to add a “fallback” function that forwards the call to some other function g(). The idea here is that our fallback function should only be used if none of the existing functions can be called. Sounds easy, right?

template <typename T>
void f (T&& x)
{
  g (std::forward<T> (x)); // forward x with its original value category
}

With this addition, let’s see which functions get called for the above example:

f (i);     // f(int)
f (2);     // f(int)
f (s);     // f(T&&) [T = std::string&]
f ("bbb"); // f(T&&) [T = const char (&)[4]]

Whoa, what just happened? This is definitely not what we want. We expect the C++ compiler to select the most specialized function, which in this case means preferring the non-template functions over our forwarding function template. As it turns out, this is still the rule, but things get a bit more complex because of perfect forwarding. Remember that the whole idea of perfect forwarding is that the parameter type is deduced to match whatever argument we pass perfectly. The C++ compiler will still choose the most specialized function (a non-template in our case) over the forwarding function, but only if its parameter type matches the argument just as perfectly. Otherwise, the forwarding function is a better match.

With this understanding it is easy to explain the above example. In the first call the argument type is int&. Both f(int) and f(int&) (signature of the forwarding function after reference collapsing) match perfectly and the former is selected as the most specialized. The same logic applies to the second call except that the signature of the forwarding function becomes f(int&&).

The last two calls are more interesting. In the third call the argument is an lvalue of type std::string. Possible matches are f(const std::string&) and f(std::string&) (again, the signature of the forwarding function after reference collapsing). Both references bind directly, but the less cv-qualified one is preferred, so the forwarding function is the better match.

In the last call, the argument type is const char (&)[4] (a reference to an array of four constant characters). While the forwarding function is again happy to oblige with a perfect match, f(const std::string&) requires a user-defined conversion and therefore loses the overload resolution. Note something else interesting here: it is perfectly plausible that the call to g() inside the forwarding f() will also require an implicit conversion. But that fact is not taken into account during overload resolution of the call to f(). In other words, perfect forwarding can hide implicit conversions.

Ok, now we know why our code doesn’t work. The question is, how do we fix it? From experience (see the link above), it seems that the best way to resolve such issues is to disable the forwarding function for certain argument types using std::enable_if. Ideally, the test that we would like to perform, in plain English, would sound like this: “can any of the non-forwarding functions be called with this argument type?” If the answer is yes, then we disable the forwarding function. Here is the outline:

template <typename T,
          typename std::enable_if<!test<T>::value, int>::type = 0>
void f (T&& x)
{
  ...
}

Unfortunately, there doesn’t seem to be a way to create such a test (i.e., there is no way to exclude the forwarding function from the overload resolution; though if you know how to achieve something like this, do tell). The next best thing is to test whether the argument type is the same as or can be implicitly converted to the parameter type of a non-forwarding function. Here is how we can fix our code using this method:

template <typename T,
          typename std::enable_if<
            !(std::is_same<T, std::string&>::value ||
              std::is_convertible<T, std::string>::value), int>::type = 0>
void f (T&& x)
{
  ...
}

Besides looking rather hairy, the obvious disadvantage of this approach is that we have to do this test for every non-forwarding function. To make the whole thing tidier we can create a helper that allows us to test multiple types at once:

template <typename T,
          typename disable_forward<T, std::string, std::wstring>::type = 0>
void f (T&& x)
{
  ...
}

Here is the implementation of disable_forward using C++11 variadic templates:

template <typename T, typename T1, typename ... R>
struct same_or_convertible
{
  static const bool value =
    std::is_same<T, T1>::value ||
    std::is_convertible<T, T1>::value ||
    same_or_convertible<T, R...>::value;
};
 
template <typename T, typename T1>
struct same_or_convertible<T, T1>
{
  static const bool value =
    std::is_same<T, T1>::value ||
    std::is_convertible<T, T1>::value;
};
 
template <typename T, typename ... R>
struct disable_forward: std::enable_if<
  !same_or_convertible<
    typename std::remove_cv<
      typename std::remove_reference<T>::type>::type,
    R...>::value,
  int>
{
  // Note: we remove the reference first (T is deduced as a reference
  // for lvalue arguments) and only then any cv-qualifiers. Deriving
  // from std::enable_if, rather than defining a nested typedef that
  // would be ill-formed when the condition is false, keeps the
  // missing ::type a substitution failure in the immediate context
  // so that SFINAE applies instead of a hard error.
};

And the overall moral of the story is this: when using perfect forwarding watch out for unexpected interactions with other language features.

C++11 range-based for loop

Wednesday, May 16th, 2012

On the surface, the new range-based for loop may seem like a simple feature, perhaps the simplest of all the core language changes in C++11. However, like with most higher-level abstractions, there are quite a few nuances once we start digging a little bit deeper. So in this post I am going to dig a little bit deeper with the intent to get a better understanding of this feature as well as the contexts in which it can and cannot be used.

The range-based for loop has the following form:

for ( declaration : expression ) statement

According to the standard, this is equivalent to the following plain for loop:

1   {
2     auto&& __range = expression;
3     for (auto __begin = begin-expression,
4               __end = end-expression;
5          __begin != __end;
6          ++__begin)
7     {
8       declaration = *__begin;
9       statement
10    }
11  }

Note that when the standard says equivalent, it means that the resulting logic is equivalent and not that this is the actual translation; in particular, the variable names (e.g., __range, __begin, etc.) are for exposition only and cannot be referred to by the application.

Ok, the equivalent plain for loop version looks quite a bit more complicated compared to the range-based one. Let’s start our examination with the __range initialization (line 2). We use automatic type deduction to determine the type of the range variable based on the initializing expression. Note also that the resulting variable is declared as an rvalue reference (auto&&). This is done to allow us to iterate over temporaries without making any copies and without imposing additional const restrictions. To see where the use of the rvalue reference becomes important, consider this example:

std::vector<int> f ();
 
for (int& x: f ())
  x = 0;

What can we have for the expression? Well, it can be a standard container, an array, a brace initializer list (in which case __range will be std::initializer_list), or anything that supports the concept of iteration by providing suitable begin() and end() functions. Here are a few examples:

int primes[] = {2, 3, 5, 7, 11, 13};
 
for (int x: primes)
  ...;
 
for (int x: {2, 3, 5, 7, 11, 13})
  ...;
 
template <typename T>
struct istream_range
{
  typedef std::istream_iterator<T> iterator_type;
 
  istream_range (std::istream& is): is_ (is) {}
 
  iterator_type begin () const
  {
    return iterator_type (is_);
  }
 
  iterator_type end () const
  {
    return iterator_type ();
  }
 
private:
  std::istream& is_;
};
 
for (int x: istream_range<int> (std::cin))
  ...;

The begin-expression and end-expression (lines 3 and 4) are determined as follows:

  • If expression is an array, then begin-expression and end-expression are __range and __range + __bound, respectively, where __bound is the array bound.
  • If expression is of a class type that declares begin() and end() member functions, then begin-expression and end-expression are __range.begin() and __range.end(), respectively.
  • Otherwise, begin-expression and end-expression are begin(__range) and end(__range), respectively, where the begin() and end() functions are looked up using argument-dependent lookup (ADL), which, for this purpose, also includes the std namespace.

With arrays taken care of by the first rule, the second rule makes sure that all the standard containers as well as all the user-defined ones that follow the standard sequence interface will work with range-based for out of the box. For example, in ODB (an ORM for C++), we have the container-like result class template which allows iteration over the query result. Because it has the standard sequence interface with a forward iterator, we didn’t have to do anything extra to make it work with range-based for.

The last rule (the fallback to the free-standing begin() and end() functions) allows us to non-invasively adapt an existing container to the range-based for loop interface.

You may be wondering why the standard explicitly adds the std namespace to ADL in the last rule. That’s a good question, since the implementations provided in std simply call the corresponding member functions (which, had they existed, would have satisfied the second rule). My guess is that this provides a single place where a custom container can be adapted to the standard interface by specializing std::begin() and std::end().

The last interesting bit is the declaration (line 8). If we specified the type explicitly, then things are pretty straightforward. However, we can also let the compiler deduce the type for us, for example:

std::vector<int> v = {1, 2, 3, 5, 7, 11};
for (auto x: v)
  ...;

When automatic type deduction is used, the declaration follows the normal auto deduction rules applied to the *__begin expression. In particular, plain auto strips the reference and top-level const, so we always get a copy of the element, even when the container is const. With auto&, on the other hand, we get a reference to the element, which is const if the container is const. For example:

std::vector<int> v = {1, 2, 3, 5, 7, 11};
const std::vector<int> cv = {1, 2, 3, 5, 7, 11};
 
for (auto x: v) // x is int
  ...;
 
for (auto x: cv) // x is int
  ...;
 
for (auto& x: v) // x is int&
  ...;
 
for (auto& x: cv) // x is const int&
  ...;

Another thing to note is the caching of the end iterator, which makes the range-based for as efficient as what we could have written ourselves. There is, however, no provision for handling cases where the container is modified during iteration, unless iterator stability is guaranteed.

While the range-based for loop only supports straight iteration, it is easy to add support for reverse iteration with a simple adapter. In fact, it is strange that something like this is not part of the standard library:

template <typename T>
struct reverse_range
{
private:
  T& x_;
 
public:
  reverse_range (T& x): x_ (x) {}
 
  auto begin () const -> decltype (this->x_.rbegin ())
  {
    return x_.rbegin ();
  }
 
  auto end () const -> decltype (this->x_.rend ())
  {
    return x_.rend ();
  }
};
 
template <typename T>
reverse_range<T> reverse_iterate (T& x)
{
  return reverse_range<T> (x);
}
 
std::vector<int> v = {1, 2, 3, 5, 7, 11};
 
for (auto x: reverse_iterate (v))
  ...;

GCC can now be built with a C++ compiler

Tuesday, May 8th, 2012

You have probably heard about the decision to allow the use of C++ in GCC itself. But it is one thing to decide this and quite another to actually make a large code base like GCC even compile with a C++ compiler instead of a C one. Well, GCC 4.7 got one step closer to this goal and can now be compiled with either a C or a C++ compiler. Starting with 4.8, the plan is to build GCC in C++ mode by default. Here is the C++ Build Status page for GCC 4.8 on various targets.

To enable the C++ mode in GCC 4.7, we use the --enable-build-with-cxx GCC configure option. As one would expect, different distributions made different decisions about how to build GCC 4.7. For example, Debian and Ubuntu use C++ while Arch Linux uses C. These differences are not visible to a typical GCC user which is why neither the GCC 4.7 release notes nor the distributions mention any of this. In fact, I didn’t know about the new C++ build mode until ODB, which is implemented as a GCC plugin, mysteriously failed to load with GCC 4.7. This “war story” is actually quite interesting so I am going to tell it below. At the end I will also discuss some implications of this change for GCC plugin development.

But first a quick recap on the GCC plugin architecture: a GCC plugin is a shared object (.so) that is dynamically loaded using the dlopen()/dlsym() API. As you may already know, with such dynamically-loaded shared objects, symbol exporting can work both ways: the executable can use symbols from the shared object and the shared object can use symbols from the executable, provided the executable was built with the -rdynamic option in order to export its symbols. This back-exporting (from executable to shared object) is quite common in GCC plugins since, to do anything useful, a plugin will most likely need to call some GCC functions.

Ok, so I built ODB with GCC 4.7 and tried to run it for the first time. The error I got looked like this:

 
cc1plus: error: cannot load plugin odb.so
odb.so: undefined symbol: instantiate_decl
 

Since the same code worked fine with GCC 4.5 and 4.6, my first thought was that in GCC 4.7 instantiate_decl() was removed, renamed, or made static. So I downloaded GCC source code and looked for instantiate_decl(). Nope, the function was there, the signature was unchanged, and it was still extern.

My next guess was that building GCC itself with the -rdynamic option was somehow botched in 4.7. So I grabbed Debian build logs (this is all happening on a Debian box with Debian-packaged GCC 4.7.0) and examined the configure output. Nope, -rdynamic was passed as before.

This was getting weirder and weirder. Running out of ideas, I decided to examine the list of symbols that are in fact exported by cc1plus (this is the actual C++ compiler; g++ is just a compiler driver). Note that these are not the normal symbols which we see when we run nm (and which can be stripped). These symbols come from the dynamic symbol table and we need to use the -D|--dynamic nm option to see them:

 
$ nm -D /usr/lib/gcc/x86_64-linux-gnu/4.7.0/cc1plus | 
grep instantiate_decl
0000000000529c50 T _Z16instantiate_declP9tree_nodeib
 

Wait a second. This looks a lot like a mangled C++ name. Sure enough:

 
$ nm -D -C /usr/lib/gcc/x86_64-linux-gnu/4.7.0/cc1plus | 
grep instantiate_decl
0000000000529c50 T instantiate_decl(tree_node*, int, bool)
 

I then ran nm without grep and saw that all the text symbols are mangled. Then it hit me: GCC is now built with a C++ compiler!

Seeing that the ODB plugin is written in C++, you may be wondering why it still referenced instantiate_decl() as a C function. Prior to 4.7, the GCC headers that a plugin had to include weren’t C++-aware. As a result, I had to wrap them in an extern "C" block. Because GCC 4.7 can be built in either C or C++ mode, that extern "C" block is only necessary in the former case. Luckily, the config.h GCC plugin header defines the ENABLE_BUILD_WITH_CXX macro which we can use to decide how to include the rest of the GCC headers:

 
#include <config.h>
 
#ifndef ENABLE_BUILD_WITH_CXX
extern "C"
{
#endif
 
...
 
#ifndef ENABLE_BUILD_WITH_CXX
} // extern "C"
#endif
 

There is also an interesting implication of this switch to the C++ mode for GCC plugin writers. In order to work with GCC 4.7, a plugin will have to be compiled with a C++ compiler even if it is written in C. Once the GCC developers actually start using C++ in the GCC source code, it won’t be possible to write a plugin in C anymore.