C++11 Surprises: August 2014

Right-minded, decent and true C++ lovers like myself hate code like this, though we can't always exactly specify why:

#include <vector>

void func()
{
 std::vector<int> vec;

 vec.push_back(10);
 vec.push_back(20);
 vec.push_back(30);
}

It so happens that I can tell you exactly why I hate code like this: we're rendering declarative data using imperative syntax. At some point during execution, vec is empty. Then it has one element, then it has two, then three. This may be unavoidable in library code, but library code is allowed to be horrible precisely so that business logic can be rational, readable and maintainable.

I have more than just philosophical objections to structuring data using successive lines of code. To me, code like the above has the smell of current or future bugs, especially when the structured, ordered data is ten or a hundred times the size of the presented vector. Also, wouldn't it be nice if that vector could be const?

In the past, I've written some pretty ugly code to populate my containers in a more declarative way. One option is to construct your container with reference to static const data. For example, maps can be constructed with input iterators:

#include <map>
#include <string>

typedef std::map<std::wstring, int> StringIntMap;

// declare my data
static const StringIntMap::value_type pairs[] = 
{
 StringIntMap::value_type(L"Hello", 10),
 StringIntMap::value_type(L"How", 20),
 StringIntMap::value_type(L"Are", 30),
 StringIntMap::value_type(L"You?", 40)
};

// size is handy
static const size_t numPairs = sizeof(pairs) / sizeof(pairs[0]);

void func2()
{
 // now I can construct my const map with reference to the data
 const StringIntMap map(pairs, pairs+numPairs);
}

It's hard to love this code, but it already smells a lot less to me. My map container is const and I have supplied the data in a declarative way. Errors in this code don't really require debugging: there's only one line of imperative code to investigate, so the order of insertion into the container can't be a problem.

A welcome addition in C++11 has been std::initializer_list, which seems quite simple when you first encounter it and rather stays that way. One very quick way to create a std::initializer_list is like this:

auto i = { 10, 20, 30 };

That's it. decltype(i) is std::initializer_list<int>. The C++11 standard is wonderfully clear on this, but what's gone on here? Why didn't we need to name a type? Why mess with C++ type deduction, a deep magic that already confuses so many people?
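The deduction can be checked by the compiler rather than taken on trust. Here is a minimal sketch; deduced_size and sum_of are names made up for illustration, not anything from the standard library. Note that the header <initializer_list> has to be included before auto is allowed to deduce the type, even though the type is never named.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>  // must be included before auto can deduce the type
#include <type_traits>

// Hypothetical helper: reports how many elements the deduced list holds.
std::size_t deduced_size()
{
    auto i = { 10, 20, 30 };

    // Compile-time check: a braced list of ints deduces std::initializer_list<int>.
    static_assert(std::is_same<decltype(i), std::initializer_list<int>>::value,
                  "auto with a braced list deduces std::initializer_list<int>");

    assert(i.size() == 3);
    return i.size();
}

// Hypothetical helper: a function can take a braced list directly as an argument.
int sum_of(std::initializer_list<int> values)
{
    int total = 0;
    for (int v : values)
        total += v;
    return total;
}
```

A call such as sum_of({ 10, 20, 30 }) builds the list on the spot, which is the same mechanism the container constructors rely on.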

The reason, as you may have guessed from the whole context of this post, is to enable more intuitive initialization and management of objects that hold collections of values, both user-defined and from the standard library. (I'm avoiding the word "aggregate" here, since it has its own precise meaning in C++.) STL containers, templated on type T, can now be constructed and extended using std::initializer_list<T>:

void func3()
{
 auto vec = std::vector<int>{ 10, 20, 30 };

 vec.insert(std::end(vec), { 40, 50, 60 });
}
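The same braces also answer the earlier wish for a const vector, and they tidy up the map from func2. What follows is a sketch of how that might look; greetings and func2b are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::wstring, int> StringIntMap;

// Hypothetical accessor: the const map from func2, minus the helper array
// and the sizeof arithmetic.
const StringIntMap& greetings()
{
    static const StringIntMap map = {
        { L"Hello", 10 },
        { L"How",   20 },
        { L"Are",   30 },
        { L"You?",  40 }
    };
    return map;
}

void func2b()
{
    // The vector from the very first example, now const and declared in one go.
    const std::vector<int> vec = { 10, 20, 30 };

    assert(vec.size() == 3);
    assert(greetings().at(L"Are") == 30);
}
```

Each inner pair of braces initializes one StringIntMap::value_type, so the data reads much like the table it represents.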

This is good, right? One thing to note is that uniform initialization means we don't have to bother with parentheses in order to invoke the vector constructor. This has lots of benefits, not least initialization that is, well, uniform (how many ways should exist to initialize objects?) and also sparing us the moans of people particularly vexed by an avoidable syntax trap.
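One wrinkle is worth seeing in code: when a type has a std::initializer_list constructor, braces prefer it, so braces and parentheses are not always interchangeable. A small sketch with std::vector<int> follows; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: braces select the std::initializer_list constructor.
std::size_t size_with_braces()
{
    std::vector<int> v{ 10, 20 };   // two elements: 10 and 20
    return v.size();
}

// Hypothetical helper: parentheses select the (count, value) constructor.
std::size_t size_with_parens()
{
    std::vector<int> v(10, 20);     // ten elements, each equal to 20
    return v.size();
}

void brace_vs_paren_demo()
{
    assert(size_with_braces() == 2);
    assert(size_with_parens() == 10);
}
```

The trap that braces do remove is presumably the one where std::vector<int> v(); declares a function rather than an empty vector; std::vector<int> v{}; has no such ambiguity.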

Good, yes, let's start using std::initializer_list for our own types! Anything we should be aware of before we crack on? Given that this is C++, the answer is: of course there is!

std::initializer_list isn't an array

A std::initializer_list object provides access to an array of const objects, but it can't be considered to own or manage those objects. The C++11 standard (section 18.9) suggests that it may be implemented as simply as a pair of pointers, or a pointer plus a length. You can't append to, delete from, insert into or otherwise alter the initializer list, all of which makes sense when you consider the nature of what we're trying to do. These are values that you have hard coded into your program, so a simple and very efficient implementation of std::initializer_list could point into static const memory.
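In practice that means reading the elements out and modifying your own copy. Here is a minimal sketch of the idea; with_extra is a hypothetical helper, not a library function.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>
#include <vector>

// Hypothetical helper: copies the elements out, because the list itself
// offers no way to change them.
std::vector<int> with_extra(std::initializer_list<int> il, int extra)
{
    // begin() hands back a pointer to const, so the elements are read-only.
    static_assert(std::is_same<decltype(il.begin()), const int*>::value,
                  "initializer_list<int>::begin() returns const int*");

    // *il.begin() = 99;   // would not compile: the element is const
    // il.push_back(40);   // would not compile: there is no such member

    std::vector<int> result(il.begin(), il.end());  // own a copy...
    result.push_back(extra);                        // ...and modify that instead

    assert(result.size() == il.size() + 1);
    return result;
}
```

The whole interface amounts to size(), begin() and end(), which is about what you'd expect from something that may be no more than a pair of pointers.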

std::initializer_list has (had) very weak lifetime guarantees

Consider the implication of these two statements:

Copying an initializer list does not copy the underlying elements (standard, section 18.9 again)

The underlying array is not guaranteed to exist after the lifetime of the original initializer list object has ended (cppreference)

Yep. In C++11, if you copy the supplied std::initializer_list and keep it hanging around (as opposed to using it immediately to initialize the state of your object) then all bets are off and the underlying data could disappear immediately, never or at any time between those two time_points. I can't imagine what would drive a person to do this, but now there's a very solid reason that you shouldn't. The wording of these semantics was updated slightly in C++14, so check it out.
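The safe pattern is to copy the elements into storage you own before the constructor returns. A sketch follows, with the risky alternative left as a comment; Samples and samples_demo are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical class, for illustration only.
class Samples
{
public:
    // Copy the elements straight away; values_ owns them from here on.
    Samples(std::initializer_list<int> il) : values_(il) {}

    std::size_t size() const { return values_.size(); }
    int at(std::size_t i) const { return values_.at(i); }

private:
    std::vector<int> values_;              // safe: holds its own copies
    // std::initializer_list<int> kept_;   // risky: would only refer to the
    //                                     // caller's array, which may be gone
    //                                     // by the time anyone reads it
};

void samples_demo()
{
    Samples s{ 1, 2, 3 };   // the braced list is only needed during construction
    assert(s.size() == 3);
    assert(s.at(2) == 3);
}
```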

Ok, let's take this thing for a spin!

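#include <algorithm>         // std::copy
#include <cstddef>           // std::size_t
#include <initializer_list>  // std::initializer_list, std::begin, std::end

template <class T>
class MyVector
{
public:
 explicit MyVector(std::initializer_list<T> il)
 :
 size_(il.size()),
 data_(new T[size_]) // relies on size_ being declared before data_
 {
  // copy the elements out of the list into storage we own
  std::copy(std::begin(il), std::end(il), data_);
 }

 ~MyVector()
 {
  delete [] data_;
 }

private:

 std::size_t size_;
 T *data_;
};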

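The usual disclaimers apply; it's for show and tell purposes only. Don't write this code, or copy and use it. Among other things, it defines no copy operations, so genuinely copying a MyVector would end up deleting data_ twice. It does, however, work exactly as you'd expect, so once we do something like this: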

auto v = MyVector<std::wstring>{L"Hello", L"to", L"you"};

data_ is indeed a bare array of three std::wstring with the expected contents, which works for me at just about every level.
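To check the contents without a debugger, a variant with a couple of accessors would do. This is only a sketch under my own assumptions: MyVector2, size() and operator[] are hypothetical additions, and I've swapped the raw pointer for std::unique_ptr<T[]> so that the class is move-only and can't double-delete.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <memory>
#include <string>

// Hypothetical variant of MyVector; the name and accessors are illustrative.
template <class T>
class MyVector2
{
public:
    explicit MyVector2(std::initializer_list<T> il)
    :
    size_(il.size()),
    data_(new T[il.size()])
    {
        // Copy the elements out of the list into storage this object owns.
        std::copy(il.begin(), il.end(), data_.get());
    }

    std::size_t size() const { return size_; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t size_;
    std::unique_ptr<T[]> data_;   // frees the array; copying is disabled
};

void my_vector2_demo()
{
    auto v = MyVector2<std::wstring>{ L"Hello", L"to", L"you" };

    assert(v.size() == 3);
    assert(v[0] == L"Hello");
    assert(v[1] == L"to");
    assert(v[2] == L"you");
}
```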

Friday 22 August 2014
