Efficient argument passing in C++11, Part 2
Tuesday, June 26th, 2012
Last week, in Part 1 of this post, we discussed various ways to achieve efficient argument passing in C++11. As you may remember, none of them offered a universal, one-size-fits-all solution. I also tried to pay special attention to some of the areas that cause extra confusion. But, alas, confusion abounded regardless of (or maybe because of; who knows) my attempts. I am also not sure if some individuals are truly confused or if they have “bought in” to a specific approach and are now exhibiting foolish consistency, which, as Emerson famously put it, is the hobgoblin of little minds.
In any case, let me try to restate the problem in a slightly different light and as concisely as I can. In C++11 there are three ways to pass an “in” argument to a function, each of which works better in some cases than in others. These are: pass by const lvalue reference, pass by value, and overload on const lvalue and rvalue references. Here are their respective signatures:
void f (const std::string&); // const reference

void f (std::string);        // value

void f (const std::string&); // const reference and
void f (std::string&&);      // rvalue reference
The const reference approach is efficient if we don’t make a copy of the passed argument. However, if we do, and the function is called with an rvalue, then we miss the opportunity of moving this argument instead of making a copy. So, in summary, pass by const reference is optimal if no copies are made. Otherwise, it misses out on rvalue arguments.
The by-value approach is efficient if we do make a copy of the passed argument. However, if we don’t, and the function is called with an lvalue, then we make an unnecessary copy. So, in summary, pass by value is optimal if we know for sure we are going to copy the argument. Otherwise, it adds a copy overhead in case of an lvalue argument.
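To put some numbers on this trade-off, here is a minimal sketch that counts copies. The tracer type and the keep_*/peek_* functions are hypothetical names made up for this illustration; they are not from Part 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: counts how many times it is copy-constructed.
struct tracer
{
  static int copies;

  std::string s;

  explicit tracer (std::string v): s (std::move (v)) {}
  tracer (const tracer& t): s (t.s) {++copies;}
  tracer (tracer&& t): s (std::move (t.s)) {}
};

int tracer::copies = 0;

// These two keep their own copy of the argument.
//
tracer keep_cref (const tracer& t) {return t;}  // Always copies.
tracer keep_value (tracer t) {return t;}        // Copies lvalues only.

// These two only inspect the argument.
//
std::size_t peek_cref (const tracer& t) {return t.s.size ();}  // No copy.
std::size_t peek_value (tracer t) {return t.s.size ();}  // Copies lvalues.

// Number of copies made when f is called with an lvalue.
//
template <typename F>
int copies_with_lvalue (F f)
{
  tracer x ("lvalue");
  tracer::copies = 0;
  f (x);
  return tracer::copies;
}

// Number of copies made when f is called with an rvalue.
//
template <typename F>
int copies_with_rvalue (F f)
{
  tracer::copies = 0;
  f (tracer ("rvalue"));
  return tracer::copies;
}

// Usage: the two problem cases and their well-behaved counterparts.
//
void demo ()
{
  assert (copies_with_rvalue (keep_cref) == 1);   // Missed move.
  assert (copies_with_rvalue (keep_value) == 0);
  assert (copies_with_lvalue (peek_cref) == 0);
  assert (copies_with_lvalue (peek_value) == 1);  // Unnecessary copy.
}
```

Only copies are counted because the number of moves can vary with how much copy elision the compiler performs.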
If we don’t know whether we will be making a copy of the argument, then neither approach gives us a satisfactory solution. And, as we have seen in Part 1, there are quite a few legitimate cases where we don’t.
The last approach (lvalue/rvalue overload) doesn’t have any of these problems. However, its biggest issue is impracticality in the face of a large number of arguments: each argument needs both an lvalue and an rvalue variant, so covering every combination of N arguments takes 2^N overloads.
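As a rough illustration of how quickly the overloads pile up, here is a sketch of a hypothetical two-argument class (full_name and its picked member are invented for this example). Two arguments already need four constructors.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical two-argument class; picked records which overload ran.
struct full_name
{
  std::string first, last;
  int picked;

  full_name (const std::string& f, const std::string& l)
    : first (f), last (l), picked (1) {}                         // copy, copy
  full_name (std::string&& f, const std::string& l)
    : first (std::move (f)), last (l), picked (2) {}             // move, copy
  full_name (const std::string& f, std::string&& l)
    : first (f), last (std::move (l)), picked (3) {}             // copy, move
  full_name (std::string&& f, std::string&& l)
    : first (std::move (f)), last (std::move (l)), picked (4) {} // move, move
};

// Usage: each lvalue/rvalue combination selects its own overload.
//
void demo ()
{
  std::string a ("John"), b ("Doe");

  assert (full_name (a, b).picked == 1);
  assert (full_name (std::string ("John"), b).picked == 2);
  assert (full_name (a, std::string ("Doe")).picked == 3);
  assert (full_name (std::string ("John"), std::string ("Doe")).picked == 4);
}
```

A third argument would double this to eight constructors, which is where the approach stops being practical.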
At the end of last week’s post we also discussed briefly what would be an ideal solution to this problem. It seems what we need is a type that binds to lvalues (as a const reference) and rvalues (as an rvalue reference) and allows us to determine which one of the two it is. We also concluded that unfortunately there is no built-in type like that in C++11.
As you may remember, I also drew an analogy with perfect forwarding, which solves exactly the same problem (passing both rvalues and lvalues in a single argument), but at compile time. Interestingly, as I was reading through the Proposal to Add an Rvalue Reference to the C++ Language (N1690), I realized that it not only provides similar functionality, but that the original motivation was exactly the same! Here is a relevant quote:
“One way to accomplish this[(forwarding)] is by overloading on the free parameter with both const and non-const lvalue references. […] However, as the number of free parameters grows, this solution quickly grows impractical. The number of overloads required increases exponentially with the number of parameters (2^N where N is the number of parameters). This proposal provides perfect forwarding using only one overload, no matter how many free parameters exist.”
So they solved it for the standard library developers (that’s where perfect forwarding will most often be used) but not for the application developers. Oh well, that’s life. To be fair, this is as much our (i.e., application developers’) fault, since we only start using new features once they become standardized. And once they are standardized, it is too late to complain.
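For comparison, this is roughly what the compile-time solution looks like. The probe and pair_of_probes types are hypothetical and exist only to show that a single forwarding constructor handles every lvalue/rvalue combination; the price is that the function has to be a template.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe: remembers how it was constructed.
struct probe
{
  enum origin {fresh, copied, moved};
  origin from;

  probe (): from (fresh) {}
  probe (const probe&): from (copied) {}
  probe (probe&&): from (moved) {}
};

struct pair_of_probes
{
  probe a, b;

  // One overload covers all four lvalue/rvalue combinations.
  //
  template <typename A, typename B>
  pair_of_probes (A&& x, B&& y)
    : a (std::forward<A> (x)), b (std::forward<B> (y)) {}
};

// Usage: lvalues are copied, rvalues are moved.
//
void demo ()
{
  probe p, q;

  pair_of_probes r1 (p, q);
  assert (r1.a.from == probe::copied && r1.b.from == probe::copied);

  pair_of_probes r2 (p, std::move (q));
  assert (r2.a.from == probe::copied && r2.b.from == probe::moved);
}
```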
If there is no built-in support for what we need, then can we create our own solution? Let’s try to arrive at the answer together. We will use the overloaded functions for rvalue and lvalue references approach as the starting point since it doesn’t have any technical problems. It does what we want, which is to distinguish between lvalue and rvalue references inside the function body. Its only drawback is that we have to have two separate function bodies for each argument. So what would be great is a way to pass lvalues and rvalues to the same function (which we can already do with a const reference) and be able to distinguish between the two (which is what we cannot do with a const reference).
So what we need is a type that can be initialized with either an lvalue or an rvalue reference and that we can later query to find out which one it is. The standard defines the std::reference_wrapper class template. Unfortunately, it doesn’t have all the functionality that we need: it is limited to lvalue references. But we can take our cue from it and create our own wrapper that can store either an rvalue or a const lvalue reference. Because its functionality is quite specific to argument passing, let’s call it in (as in “in” parameter) instead of something more generic, like lr_reference_wrapper. While in is a very short name with plenty of opportunities for clashes, it also has the potential of being used throughout the application. By making it short we are trying to keep the code as concise as possible. Also, the proper place for something this fundamental is probably the std namespace, so we would have std::in instead of just in. Here is my take on this class template:
#include <new>         // Placement new.
#include <utility>     // std::move, std::forward
#include <type_traits>

template <typename T>
struct in
{
  in (const T& l): lv_ (&l), rv_ (0) {}
  in (T&& r): lv_ (0), rv_ (&r) {}

  // Accessors.
  //
  bool lvalue () const {return lv_ != 0;}
  bool rvalue () const {return rv_ != 0;}

  operator const T& () const {return get ();}
  const T& get () const {return lv_ ? *lv_ : *rv_;}
  T&& rget () const {return std::move (*rv_);}

  // Move. Returns a copy if lvalue.
  //
  T move () const {return lv_ ? T (*lv_) : T (std::move (*rv_));}

  // Support for implicit conversion via perfect forwarding. The storage
  // destroys the converted temporary, if one was created in it.
  //
  struct storage
  {
    storage (): created (false) {}
    ~storage () {if (created) reinterpret_cast<T*> (&data)->~T ();}

    bool created;
    typename std::aligned_storage<sizeof (T), alignof (T)>::type data;
  };

  template <typename T1,
            typename std::enable_if<
              std::is_convertible<T1, T>::value, int>::type = 0>
  in (T1&& x, storage&& s = storage ())
    : lv_ (0), rv_ (new (&s.data) T (std::forward<T1> (x)))
  {
    s.created = true;
  }

  in (T& l): lv_ (&l), rv_ (0) {} // For T1&& becoming T1&.

private:
  const T* lv_;
  T* rv_;
};
Most of the above code should be self-explanatory, except, maybe, for the part implementing support for implicit conversion. To understand what’s going on there and why it is necessary, let’s assume we didn’t have those last two constructors. Now consider this code fragment as an example:
void f (in<std::string>);

std::string s ("foo");

f (s);                   // Ok, argument is lvalue.
f (std::string ("bar")); // Ok, argument is rvalue.
f ("baz");               // Error.
Without the implicit conversion support, the last call fails since there is no way to convert a C-string to in<std::string>. This is because the in class template itself relies on an implicit conversion, and C++ applies at most one user-defined conversion when trying to pass an argument. In other words, it doesn’t do implicit conversion chains (i.e., "baz" -> std::string -> in<std::string>).
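The one-conversion limit can be checked at compile time. The sketch below uses a hypothetical cut-down wrapper, basic_in, that has only the two basic constructors; the names basic_in and length are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical cut-down wrapper with only the two basic constructors.
template <typename T>
struct basic_in
{
  basic_in (const T& l): p_ (&l) {}
  basic_in (T&& r): p_ (&r) {}

  const T& get () const {return *p_;}

private:
  const T* p_;
};

std::size_t length (basic_in<std::string> s) {return s.get ().size ();}

// One user-defined conversion: std::string -> basic_in<std::string>.
//
static_assert (
  std::is_convertible<std::string, basic_in<std::string>>::value,
  "a single conversion is fine");

// Two would be needed: const char* -> std::string -> basic_in<std::string>.
//
static_assert (
  !std::is_convertible<const char*, basic_in<std::string>>::value,
  "conversions do not chain");

// Usage: the caller has to perform the first conversion explicitly.
//
void demo ()
{
  std::string s ("foo");

  assert (length (s) == 3);
  assert (length (std::string ("baz")) == 3);
  // length ("baz") would not compile.
}
```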
The implicit conversion support uses perfect forwarding and has a few tricky areas that need explaining. The first thing to note is the use of std::enable_if to only enable the implicit conversion if the underlying type supports it. Without this restriction our in class template will be happy to convert from anything to anything, which will mess up overload resolution at the function level.
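Here is a small sketch of the difference the restriction makes. Both greedy and picky are hypothetical stand-ins that show only the converting constructor, not the rest of the wrapper.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical wrapper without the restriction.
template <typename T>
struct greedy
{
  template <typename T1>
  greedy (T1&&) {}  // Accepts any argument type.
};

// Hypothetical wrapper with the restriction.
template <typename T>
struct picky
{
  template <typename T1,
            typename std::enable_if<
              std::is_convertible<T1, T>::value, int>::type = 0>
  picky (T1&&) {}   // Accepts only what converts to T.
};

// With greedy, every wrapper claims every argument, so two overloads
// taking greedy<std::string> and greedy<int> could not be told apart.
//
static_assert (std::is_convertible<int, greedy<std::string>>::value, "");
static_assert (std::is_convertible<int, greedy<int>>::value, "");

// With picky, an int no longer converts to the std::string wrapper.
//
static_assert (!std::is_convertible<int, picky<std::string>>::value, "");

int p (picky<std::string>) {return 1;}
int p (picky<int>) {return 2;}

// Usage: each argument matches exactly one overload.
//
void demo ()
{
  assert (p ("abc") == 1);
  assert (p (42) == 2);
}
```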
The second tricky area is the construction of a temporary that is the result of the implicit conversion. There are two straightforward ways to implement this: either allocate the temporary dynamically or make it a member of the class. Both of these approaches have major drawbacks. The dynamic allocation approach requires, well, dynamic allocation while the member approach occupies the stack space regardless of whether we actually need to do an implicit conversion or not, and in most cases we probably won’t need to. Instead, the above implementation allocates suitably aligned storage for a temporary as a second argument to the implicit conversion constructor. The lifetime of this storage is guaranteed until the end of the full expression (i.e., until ; in most cases), which is sufficient for our needs.
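The lifetime rule can be observed directly. In this sketch the slot type and the grab function are hypothetical; slot simply counts live instances, and the default argument is bound to an rvalue reference parameter so that it is a temporary created at the call site.

```cpp
#include <cassert>

// Hypothetical stand-in for the storage: counts live instances.
struct slot
{
  static int alive;

  slot () {++alive;}
  ~slot () {--alive;}

  slot (const slot&) = delete;
  slot& operator= (const slot&) = delete;
};

int slot::alive = 0;

// The default argument is a temporary created by the caller. Returning
// its address mimics a wrapper that keeps a pointer into the storage.
//
slot* grab (slot&& s = slot ()) {return &s;}

// True if the temporary still exists after grab() has returned but before
// the enclosing full expression has ended.
//
bool alive_after_return ()
{
  return (static_cast<void> (grab ()), slot::alive == 1);
}

// Usage.
//
void demo ()
{
  assert (slot::alive == 0);
  assert (alive_after_return ());  // Alive within the full expression.
  assert (slot::alive == 0);       // Gone once the expression has ended.
}
```

Binding the temporary to a reference parameter is what makes this dependable. As far as I can tell, the lifetime of a by-value parameter is less clear-cut and may end as soon as the function returns.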
Let’s now see how we would use this new facility to implement the two versions of our email constructor from last week:
email (in<std::string> first,
       in<std::string> last,
       in<std::string> addr)
  : first_ (first.move ()),
    last_ (last.move ()),
    addr_ (addr.move ())
{
}
email (in<std::string> first,
       in<std::string> last,
       in<std::string> addr)
  : email_ (first.move ())
{
  email_ += ' ';
  email_ += last;
  email_ += " <";
  email_ += addr;
  email_ += '>';
}
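To try the second version out, here is a self-contained sketch. It assumes a cut-down in without the implicit conversion support (so string literals have to be wrapped in std::string by hand), and the email class around the constructor, including its str() accessor, is my guess at a minimal host for it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Cut-down version of in<T> without the implicit conversion support.
template <typename T>
struct in
{
  in (const T& l): lv_ (&l), rv_ (0) {}
  in (T&& r): lv_ (0), rv_ (&r) {}

  operator const T& () const {return lv_ ? *lv_ : *rv_;}
  T move () const {return lv_ ? T (*lv_) : T (std::move (*rv_));}

private:
  const T* lv_;
  T* rv_;
};

// Hypothetical host class for the second constructor: first is stored
// (moved if it is an rvalue) while last and addr are only read.
//
class email
{
public:
  email (in<std::string> first,
         in<std::string> last,
         in<std::string> addr)
    : email_ (first.move ())
  {
    email_ += ' ';
    email_ += last;
    email_ += " <";
    email_ += addr;
    email_ += '>';
  }

  const std::string& str () const {return email_;}

private:
  std::string email_;
};

// Usage: lvalues and rvalues can be mixed freely at the call site.
//
void demo ()
{
  std::string first ("John");
  email e (first, std::string ("Doe"), std::string ("john@example.com"));

  assert (e.str () == "John Doe <john@example.com>");
  assert (first == "John");  // The lvalue was copied, not moved from.
}
```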
A slightly more interesting example is the reimplementation of operator+ for the matrix class using this approach:
matrix operator+ (in<matrix> x, in<matrix> y)
{
  matrix r (x.rvalue () ? x.move () : y.move ());
  r += (x.rvalue () ? y : x);
  return r;
}
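Here is a slightly more complicated implementation that tries to save a move constructor call by using the rvalue directly (note, however, that a conditional expression whose operands are an xvalue and a prvalue yields a prvalue, so a move into a temporary still takes place):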
matrix operator+ (in<matrix> x, in<matrix> y)
{
  matrix&& x1 = x.rvalue () ? x.rget () :
                y.rvalue () ? y.rget () :
                matrix (x);

  const matrix& y1 = x.rvalue () ? y :
                     y.rvalue () ? x :
                     y;
  x1 += y1;
  return std::move (x1);
}
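To check how many deep copies the first of the two implementations makes, here is a sketch with a toy matrix. Both the cut-down in and this matrix (a flat vector with a copy counter) are assumptions made for the test and not the class from Part 1.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Cut-down version of in<T> without the implicit conversion support.
template <typename T>
struct in
{
  in (const T& l): lv_ (&l), rv_ (0) {}
  in (T&& r): lv_ (0), rv_ (&r) {}

  bool rvalue () const {return rv_ != 0;}
  operator const T& () const {return lv_ ? *lv_ : *rv_;}
  T move () const {return lv_ ? T (*lv_) : T (std::move (*rv_));}

private:
  const T* lv_;
  T* rv_;
};

// Hypothetical toy matrix that counts deep copies.
struct matrix
{
  static int copies;

  std::vector<double> v;

  explicit matrix (std::size_t n, double x = 0): v (n, x) {}
  matrix (const matrix& m): v (m.v) {++copies;}
  matrix (matrix&& m): v (std::move (m.v)) {}

  matrix& operator+= (const matrix& m)
  {
    for (std::size_t i (0); i != v.size (); ++i)
      v[i] += m.v[i];
    return *this;
  }
};

int matrix::copies = 0;

matrix operator+ (in<matrix> x, in<matrix> y)
{
  matrix r (x.rvalue () ? x.move () : y.move ());
  r += (x.rvalue () ? y : x);
  return r;
}

// Usage: a copy is made only when both operands are lvalues.
//
void demo ()
{
  matrix a (2, 1.0), b (2, 2.0);

  matrix::copies = 0;
  matrix c = a + b;
  assert (matrix::copies == 1 && c.v[0] == 3.0);

  matrix::copies = 0;
  matrix d = matrix (2, 1.0) + b;
  assert (matrix::copies == 0 && d.v[1] == 3.0);
}
```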
While this approach solves all the problems of the other three methods, it also has some of its own. The biggest issue is conceptual rather than technical: this approach is not transparent; we have to use a non-core language mechanism for something as fundamental as efficiently passing values to functions. Though this can probably be overcome if something like this ends up in the standard and its use becomes idiomatic.
While callers of a function that uses the in class template don’t need to do anything special, inside the function things are not as pretty. Because the actual value is now wrapped, we cannot access its member functions directly. Instead, we first have to explicitly “unwrap” it, for example:
void f (in<std::string> s)
{
  if (!s.get ().empty ())
  {
    ...
  }
}
One way to somewhat rectify this situation would be to provide operator->, even though in is not really a pointer.
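A minimal sketch of what that could look like, again on a cut-down in. The operator-> shown here is a hypothetical addition that gives read-only access, and the blank function is just an example caller.

```cpp
#include <cassert>
#include <string>

// Cut-down version of in<T> with a hypothetical operator-> added.
template <typename T>
struct in
{
  in (const T& l): lv_ (&l), rv_ (0) {}
  in (T&& r): lv_ (0), rv_ (&r) {}

  const T& get () const {return lv_ ? *lv_ : *rv_;}

  // Read-only member access without an explicit get().
  //
  const T* operator-> () const {return lv_ ? lv_ : rv_;}

private:
  const T* lv_;
  T* rv_;
};

bool blank (in<std::string> s) {return s->empty ();}

// Usage.
//
void demo ()
{
  std::string s ("foo");

  assert (!blank (s));
  assert (blank (std::string ()));
}
```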
On the technical side, this approach has surprisingly few issues, at least as far as I can see (if you spot others, do share them in the comments below). The only potentially serious issue is the possible ambiguity if a second overload has an argument type that is implicitly constructible from the first overload’s argument type. Here is an example:
struct my_string
{
  my_string (const std::string&);
  ...
};

void f (in<std::string>);
void f (in<my_string>);

std::string s ("foo");
f (s); // Error.
Here we have a problem because both in<std::string> and in<my_string> can be implicitly constructed from std::string. One way to resolve this would be to add a list of excluded implicit conversions to the in class template:
void f (in<std::string>);
void f (in<my_string, std::string>); // std::string is excluded
                                     // from implicit conversions.
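I haven’t settled on how the exclusion list would be implemented, but one possible sketch is to add a parameter pack to in and an extra condition to the enable_if test. Everything below is an assumption: the is_one_of helper, the cut-down in with its own storage, and the use of std::decay to compare argument types.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: true if T is one of Ts.
template <typename T, typename... Ts>
struct is_one_of: std::false_type {};

template <typename T, typename U, typename... Ts>
struct is_one_of<T, U, Ts...>
  : std::conditional<std::is_same<T, U>::value,
                     std::true_type,
                     is_one_of<T, Ts...>>::type {};

// Cut-down in<> with a list of argument types that are excluded from the
// implicit conversion. Only what overload resolution needs is shown.
//
template <typename T, typename... Excluded>
struct in
{
  in (const T& l): p_ (&l) {}
  in (T&& r): p_ (&r) {}
  in (T& l): p_ (&l) {}

  const T& get () const {return *p_;}

  // Holds the converted temporary and destroys it if one was created.
  //
  struct storage
  {
    T* obj = nullptr;
    alignas (T) unsigned char data[sizeof (T)];

    ~storage () {if (obj) obj->~T ();}
  };

  template <typename T1,
            typename std::enable_if<
              std::is_convertible<T1, T>::value &&
              !is_one_of<typename std::decay<T1>::type, Excluded...>::value,
              int>::type = 0>
  in (T1&& x, storage&& s = storage ())
    : p_ (s.obj = new (s.data) T (std::forward<T1> (x))) {}

private:
  const T* p_;
};

struct my_string
{
  my_string (const std::string& v): s (v) {}
  std::string s;
};

int f (in<std::string>) {return 1;}
int f (in<my_string, std::string>) {return 2;}  // std::string is excluded.

// Usage: the call that was ambiguous now picks the std::string overload.
//
void demo ()
{
  std::string s ("foo");

  assert (f (s) == 1);
  assert (f (my_string (s)) == 2);
}
```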
Not very elegant, I know, but, as you might have noticed, nothing about this topic appears terribly elegant.
So, there you have it. An inelegant solution that nevertheless seems to do the trick. Do I really suggest that we start using it in our applications? Well, for the answer you will have to wait until Part 3 of this post next week where we will try to come up with some sort of guidelines on which approach to use when. In the meantime, tell us what you think. You can do it in the comments below or in the /r/cpp discussion of this post.