C++11 range-based for
loop
On the surface, the new range-based for
loop may seem like a simple feature, perhaps the simplest of all the core language changes in C++11. However, like with most higher-level abstractions, there are quite a few nuances once we start digging a little bit deeper. So in this post I am going to dig a little bit deeper with the intent to get a better understanding of this feature as well as the contexts in which it can and cannot be used.
The range-based for
loop has the following form:
for ( declaration : expression ) statement
According to the standard, this is equivalent to the the following plain for
loop:
1 { 2 auto&& __range = expression; 3 for (auto __begin = begin-expression, 4 __end = end-expression; 5 __begin != __end; 6 ++__begin) 7 { 8 declaration = *__begin; 9 statement 10 } 11 }
Note that when the standard says equivalent, it means that the resulting logic is equivalent and not that this is the actual translation; in particular, the variable names (e.g., __range
, __begin
, etc.) are for exposition only and cannot be referred to by the application.
Ok, the equivalent plain for
loop version looks quite a bit more complicated compared to the range-based one. Let’s start our examination with the __range
initialization (line 2). We use automatic type deduction to determine the type of the range variable based on the initializing expression. Note also that the resulting variable is made an r-value reference. This is done to allow us to iterate over temporaries without making any copies and without imposing additional const
restrictions. To see where the use of the r-value reference becomes important, consider this example:
std::vector<int> f (); for (int& x: f ()) x = 0;
What can we have for the expression? Well, it can be a standard container, an array, a brace initializer list (in which case __range
will be std::initializer_list
), or anything that supports the concept of iteration by providing suitable begin()
and end()
functions. Here are a few examples:
int primes[] = {1, 2, 3, 5, 7, 11}; for (int x: primes) ...; for (int x: {1, 2, 3, 5, 7, 11}) ...; template <typename T> struct istream_range { typedef std::istream_iterator<T> iterator_type; istream_range (std::istream& is): is_ (is) {} iterator_type begin () const { return iterator_type (is_); } iterator_type end () const { return iterator_type (); } private: std::istream& is_; }; for (int x: istream_range<int> (cin)) ...;
The begin-expression and end-expression (lines 3 and 4) are determined as follows:
- If expression is an array, then begin-expression and end-expression are
__range
and__range + __bound
, respectively, where__bound
is the array bound. - If expression is of a class type that declares
begin()
andend()
member functions, then begin-expression and end-expression are__range.begin()
and__range.end()
, respectively. - Otherwise, begin-expression and end-expression are
begin(__range)
andend(__range)
, respectively, where thebegin()
andend()
functions are looked up using the argument-dependent lookup (ADL) which also includes thestd
namespace.
With arrays taken care of by the first rule, the second rule makes sure that all the standard containers as well as all the user-defined ones that follow the standard sequence interface will work with range-based for
out of the box. For example, in ODB (an ORM for C++), we have the container-like result
class template which allows iteration over the query result. Because it has the standard sequence interface with a forward iterator, we didn’t have to do anything extra to make it work with range-based for
.
The last rule (the fallback to the free-standing begin()
and end()
functions) allows us to non-invasively adapt an existing container to the range-based for
loop interface.
You may be wondering why did the standard explicitly add the std
namespace to ADL in the last rule? That’s a good question since the implementations provided in std
simply call the corresponding member functions (which, if existed, would have satisfied the second rule). My guess is that it allows for a single place where a custom container can be adapted to the standard interface by specializing std::begin()
and std::end()
.
The last interesting bit is the declaration (line 8). If we specified the type explicitly, then things are pretty straightforward. However, we can also let the compiler deduce the type for us, for example:
std::vector<int> v = {1, 2, 3, 5, 7, 11}; for (auto x: v) ...;
When automatic type deduction is used, generally, the resulting type will be the type of the *(__range.begin())
or *(begin(__range))
expression. When standard containers are used, however, the type will be const
element type when __range
is const
and we are forming a reference and just the element type otherwise. For example:
std::vector<int> v = {1, 2, 3, 5, 7, 11}; const std::vector<int> cv = {1, 2, 3, 5, 7, 11}; for (auto x: v) // x is int ...; for (auto x: cv) // x is int ...; for (auto& x: v) // x is int& ...; for (auto& x: cv) // x is const int& ...;
Another thing to note is the caching of the end iterator which makes the range-based for
as efficient as what we could have written ourselves. There is, however, no provision for handling cases where the container is modified during iteration, unless iterator stability is guaranteed.
While the range-based for
loop only supports straight iteration, it is easy to add support for reverse iteration with a simple adapter. In fact, it is strange that something like this is not part of the standard library:
template <typename T> struct reverse_range { private: T& x_; public: reverse_range (T& x): x_ (x) {} auto begin () const -> decltype (this->x_.rbegin ()) { return x_.rbegin (); } auto end () const -> decltype (this->x_.rend ()) { return x_.rend (); } }; template <typename T> reverse_range<T> reverse_iterate (T& x) { return reverse_range<T> (x); } std::vector<int> v = {1, 2, 3, 5, 7, 11}; for (auto x: reverse_iterate (v)) ...;