C++11 range-based for loop

C++11 range-based `for` loop

On the surface, the new range-based for loop may seem like a simple feature, perhaps the simplest of all the core language changes in C++11. However, like with most higher-level abstractions, there are quite a few nuances once we start digging a little bit deeper. So in this post I am going to dig a little bit deeper with the intent to get a better understanding of this feature as well as the contexts in which it can and cannot be used.

The range-based for loop has the following form:

for ( declaration : expression ) statement

According to the standard, this is equivalent to the the following plain for loop:

1   {
2     auto&& __range = expression;
3     for (auto __begin = begin-expression,
4               __end = end-expression;
5          __begin != __end;
6          ++__begin)
7     {
8       declaration = *__begin;
9       statement
10    }
11  }

Note that when the standard says equivalent, it means that the resulting logic is equivalent and not that this is the actual translation; in particular, the variable names (e.g., __range, __begin, etc.) are for exposition only and cannot be referred to by the application.

Ok, the equivalent plain for loop version looks quite a bit more complicated compared to the range-based one. Let’s start our examination with the __range initialization (line 2). We use automatic type deduction to determine the type of the range variable based on the initializing expression. Note also that the resulting variable is made an r-value reference. This is done to allow us to iterate over temporaries without making any copies and without imposing additional const restrictions. To see where the use of the r-value reference becomes important, consider this example:

std::vector<int> f ();
 
for (int& x: f ())
  x = 0;

What can we have for the expression? Well, it can be a standard container, an array, a brace initializer list (in which case __range will be std::initializer_list), or anything that supports the concept of iteration by providing suitable begin() and end() functions. Here are a few examples:

int primes[] = {1, 2, 3, 5, 7, 11};
 
for (int x: primes)
  ...;
 
for (int x: {1, 2, 3, 5, 7, 11})
  ...;
 
template <typename T>
struct istream_range
{
  typedef std::istream_iterator<T> iterator_type;
 
  istream_range (std::istream& is): is_ (is) {}
 
  iterator_type begin () const
  {
    return iterator_type (is_);
  }
 
  iterator_type end () const
  {
    return iterator_type ();
  }
 
private:
  std::istream& is_;
};
 
for (int x: istream_range<int> (cin))
  ...;

The begin-expression and end-expression (lines 3 and 4) are determined as follows:

If expression is an array, then begin-expression and end-expression are __range and __range + __bound, respectively, where __bound is the array bound.
If expression is of a class type that declares begin() and end() member functions, then begin-expression and end-expression are __range.begin() and __range.end(), respectively.
Otherwise, begin-expression and end-expression are begin(__range) and end(__range), respectively, where the begin() and end() functions are looked up using the argument-dependent lookup (ADL) which also includes the std namespace.

With arrays taken care of by the first rule, the second rule makes sure that all the standard containers as well as all the user-defined ones that follow the standard sequence interface will work with range-based for out of the box. For example, in ODB (an ORM for C++), we have the container-like result class template which allows iteration over the query result. Because it has the standard sequence interface with a forward iterator, we didn’t have to do anything extra to make it work with range-based for.

The last rule (the fallback to the free-standing begin()and end() functions) allows us to non-invasively adapt an existing container to the range-based for loop interface.

You may be wondering why did the standard explicitly add the std namespace to ADL in the last rule? That’s a good question since the implementations provided in std simply call the corresponding member functions (which, if existed, would have satisfied the second rule). My guess is that it allows for a single place where a custom container can be adapted to the standard interface by specializing std::begin() and std::end().

The last interesting bit is the declaration (line 8). If we specified the type explicitly, then things are pretty straightforward. However, we can also let the compiler deduce the type for us, for example:

std::vector<int> v = {1, 2, 3, 5, 7, 11};
for (auto x: v)
  ...;

When automatic type deduction is used, generally, the resulting type will be the type of the *(__range.begin()) or *(begin(__range)) expression. When standard containers are used, however, the type will be const element type when __range is const and we are forming a reference and just the element type otherwise. For example:

std::vector<int> v = {1, 2, 3, 5, 7, 11};
const std::vector<int> cv = {1, 2, 3, 5, 7, 11};
 
for (auto x: v) // x is int
  ...;
 
for (auto x: cv) // x is int
  ...;
 
for (auto& x: v) // x is int&
  ...;
 
for (auto& x: cv) // x is const int&
  ...;

Another thing to note is the caching of the end iterator which makes the range-based for as efficient as what we could have written ourselves. There is, however, no provision for handling cases where the container is modified during iteration, unless iterator stability is guaranteed.

While the range-based for loop only supports straight iteration, it is easy to add support for reverse iteration with a simple adapter. In fact, it is strange that something like this is not part of the standard library:

template <typename T>
struct reverse_range
{
private:
  T& x_;
 
public:
  reverse_range (T& x): x_ (x) {}
 
  auto begin () const -> decltype (this->x_.rbegin ())
  {
    return x_.rbegin ();
  }
 
  auto end () const -> decltype (this->x_.rend ())
  {
    return x_.rend ();
  }
};
 
template <typename T>
reverse_range<T> reverse_iterate (T& x)
{
  return reverse_range<T> (x);
}
 
std::vector<int> v = {1, 2, 3, 5, 7, 11};
 
for (auto x: reverse_iterate (v))
  ...;

This entry was posted on Wednesday, May 16th, 2012 at 7:17 am and is filed under C++. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

Comments are closed.