Friday 22 August 2014

std::initializer_list

Right-minded, decent and true C++ lovers like myself hate code like this, though we can't always exactly specify why:


void func()
{
 std::vector<int> vec;

 vec.push_back(10);
 vec.push_back(20);
 vec.push_back(30); 
}


It so happens that I can tell you exactly why I hate code like this: we're rendering declarative data using imperative syntax. At some point during execution, vec is empty. Then it has one element, then it has two, then three. This may be unavoidable in library code but library code is horrible precisely to allow business logic to be rational, readable and maintainable.

I have more than just philosophical objections to structuring data using successive lines of code. To me, code like the above has the smell of current or future bugs, especially when the structured, ordered data is ten or a hundred times the size of the presented vector. Also, wouldn't it be nice if that vector could be const?

In the past, I've written some pretty ugly code to populate my containers in a more declarative way. One option is to construct your container with reference to static const data. For example, maps can be constructed with input iterators:


#include <map>
#include <string>

typedef std::map<std::wstring, int> StringIntMap;

// declare my data
static const StringIntMap::value_type pairs[] = 
{
 StringIntMap::value_type(L"Hello", 10),
 StringIntMap::value_type(L"How", 20),
 StringIntMap::value_type(L"Are", 30),
 StringIntMap::value_type(L"You?", 40)
};

// size is handy
static const size_t numPairs = sizeof(pairs) / sizeof(pairs[0]);

void func2()
{
 // now I can construct my const map with reference to the data
 const StringIntMap map(pairs, pairs+numPairs);
}

It's hard to love this code, but it already smells a lot less to me. My map container is const and I have supplied the data in a declarative way. Errors in this code don't really require debugging; there's only one line of imperative code that can realistically be investigated, and therefore the order of insertion into the container can't be a problem.

A welcome addition in C++11 has been std::initializer_list, which seems quite simple when you first encounter it and rather stays that way. One very quick way to create a std::initializer_list is like this:


auto i = { 10, 20, 30 };

That's it. decltype(i) is std::initializer_list<int>. The C++11 standard is wonderfully clear on this, but what's gone on here? Why didn't we need to name a type? Why mess with C++ type deduction, a deep magic that already confuses so many people?

The reason, as you may have guessed from the whole context of this post, is to enable more intuitive initialization and management of aggregate objects, both user-defined and from the standard library. STL containers, templated on type T, can now be constructed and extended using std::initializer_list<T>:


void func3()
{
 auto vec = std::vector<int>{ 10, 20, 30 };

 vec.insert(std::end(vec), { 40, 50, 60 });
}

This is good, right? One thing to note is that uniform initialization means we don't have to bother with parentheses in order to invoke the vector constructor. This has lots of benefits, not least initialization that is, well, uniform (how many ways should exist to initialize objects?) and also sparing us the moans of people particularly vexed by an avoidable syntax trap. 
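As a minimal sketch of the same idea applied to the earlier map, braces let the container be built in a single expression, so it can be const from the start. The helper names below (makeGreetingMap, bracedVector, parenthesisedVector) are hypothetical and exist only for illustration. The last two functions show the one place where braces and parentheses disagree for std::vector<int>, which is worth knowing before going all-in on braces:

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::wstring, int> StringIntMap;

// Hypothetical helper: builds the map from the earlier example declaratively.
inline StringIntMap makeGreetingMap()
{
    return StringIntMap{
        { L"Hello", 10 },
        { L"How",   20 },
        { L"Are",   30 },
        { L"You?",  40 }
    };
}

// Braces prefer the std::initializer_list constructor: two elements, 3 and 1.
inline std::vector<int> bracedVector() { return std::vector<int>{ 3, 1 }; }

// Parentheses pick the (count, value) constructor: three elements, all 1.
inline std::vector<int> parenthesisedVector() { return std::vector<int>(3, 1); }
```

A caller could then write const StringIntMap map = makeGreetingMap(); and never see the map in a half-built state.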

Good, yes, let's start using std::initializer_list for our own types! Anything we should be aware of before we crack on? Given that this is C++, the answer is of course there is!

std::initializer_list isn't an array

A std::initializer_list object provides access to an array of objects, but it can't be considered to own or manage them. The C++11 standard (section 18.9) suggests that it may be implemented in terms as simple as a pair of pointers. You can't append to, delete from, insert into or otherwise alter the initializer list, all of which makes sense when you consider the nature of what we're trying to do. These are values that you have hard coded into your program, so a simple and very efficient implementation of std::initializer_list would point into static const memory.
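A small sketch of what that read-only view looks like in practice. The function name sumOf is hypothetical; the only interface relied on is begin() and end() (via the range-based for) and the iterator typedef:

```cpp
#include <initializer_list>
#include <type_traits>

// Hypothetical helper: sums whatever values the caller wrote between braces.
inline int sumOf(std::initializer_list<int> values)
{
    int total = 0;
    for (const int value : values)   // iterates from begin() to end()
    {
        total += value;
    }
    return total;
}

// Iteration only ever yields pointers to const elements, so nothing can be altered.
static_assert(
    std::is_same<std::initializer_list<int>::iterator, const int *>::value,
    "initializer_list iterators are pointers to const");
```

Calling sumOf({ 10, 20, 30 }) reads the three values in place; there is no push_back, erase or non-const access to reach for.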

std::initializer_list has (had) very weak lifetime guarantees

Consider the implication of these two statements:

  • Copying an initializer list does not copy the underlying elements (standard, section 18.9 again)
  • The underlying array is not guaranteed to exist after the lifetime of the original initializer list object has ended (cppreference)

Yep. In C++11, if you copy the supplied std::initializer_list and keep it hanging around (as opposed to using it immediately to initialize the state of your object) then all bets are off and the underlying data could disappear immediately, never or at any time between those two time_points. I can't imagine what would drive a person to do this, but now there's a very solid reason that you shouldn't. These semantics have been updated slightly in C++14, so check them out.


    Ok, let's take this thing for a spin!

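    #include <algorithm>
    #include <cstddef>
    #include <initializer_list>
    #include <iterator>

    template <class T>
    class MyVector
    {
    public:
     explicit MyVector(std::initializer_list<T> il)
     :
     size_(il.size()),
     data_(new T[size_])
     {
      // copy the elements out while the list is still alive
      std::copy(std::begin(il), std::end(il), data_);
     }
    
     ~MyVector()
     {
      delete [] data_;
     }
    
    private:
    
     // size_ must be declared before data_, because data_ is sized from it
     size_t size_;
     T *data_;
    };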
    

    The usual disclaimers apply; it's for show and tell purposes only. Don't write this code, or copy and use it. It does, however, work exactly as you'd expect, so once we do something like this:


    auto v = MyVector<std::wstring>{L"Hello", L"to", L"you"};
    

    data_ is indeed a bare array of three std::wstring with the expected contents, which works for me at just about every level.
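For comparison, here is a sketch of how the same constructor might look if the storage were delegated to std::vector rather than a bare array. SafeVector is a hypothetical name and not part of the code above. The point is only that the elements are copied out of the list during construction, so nothing depends on the list's lifetime afterwards, and the compiler-generated copy operations and destructor do the right thing:

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>

template <class T>
class SafeVector
{
public:
    // Copy the elements immediately; the initializer_list itself is never stored.
    explicit SafeVector(std::initializer_list<T> il)
        : data_(il.begin(), il.end())
    {
    }

    std::size_t size() const { return data_.size(); }
    const T &operator[](std::size_t index) const { return data_[index]; }

private:
    std::vector<T> data_;   // owns the copies and cleans up after itself
};
```

Construction looks the same as before, for example SafeVector<std::wstring> v{ L"Hello", L"to", L"you" };.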

    Thursday 27 June 2013

    Limits and type predicates

    Two things packed into one post, and neither has been well publicised as a feature. I do think both will help you to develop better code.


    The limits header

    You know how you love the constants defined in float.h (<cfloat>) and limits.h (<climits>)? No? Certainly I imagine you're used to using these constants, as they're critical for avoiding whole classes of overflow and underflow bugs. There are several problems with them, not least their names, which can be confusing and hard to guess, and also the fact that if the type of your variable changes (as it so often does during development) you have to remember to update the constant. This is because there's no fundamental link between the constants and their types, and absolutely nothing stops you writing code like this:
     const int wat = CHAR_MIN;  
    

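    I grant you, there are situations where you might want to do this, but it's pretty horrible, right? In almost all situations, I want the limit tied to the type and this is where std::numeric_limits can help. The template itself predates C++11; what C++11 adds is constexpr member functions and a few new members such as lowest(), while decltype lets us tie the limit to the variable: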
     #include "stdafx.h"  
       
     #include <iostream>  
     #include <limits>  
       
     int main(int argc, char* argv[])  
     {  
         const int  intMax  = std::numeric_limits<decltype(intMax)>::max();  
         const float floatMin = std::numeric_limits<decltype(floatMin)>::min();  
       
         std::cout << "Max int is " << intMax << " and min float is " << floatMin << std::endl;  
         return 0;  
     }  
    

    (Note, these values should be constexpr rather than const, but my compiler doesn't currently support this)


    On my machine, this prints out:

    Max int is 2147483647 and min float is 1.17549e-038

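    which is good. Be aware that for floating-point types min() is the smallest positive normalised value rather than the most negative one; C++11 adds lowest() for that. Now we can stop using those constants. Now, if I were you, I'd be scheming of evil things to do with std::numeric_limits to break it. What happens if you pass in a string? Or a class A you've just whipped up yourself?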
     const std::string strMin = std::numeric_limits<decltype(strMin)>::min();  
    

    Who would do such a thing? Ah well, surely the compiler can protect you from this kind of thing? Nope, it builds for me. Let's give it a go...



    Well that went badly. So Microsoft at least haven't done anything to stop you taking mins and maxes of things that aren't numeric. Could they have?

    Type predicates

    The new <type_traits> header contains some very interesting predicates to allow you to evaluate, at build time, whether a type has a particular property. Some of the more interesting ones are:
    • has_virtual_destructor
    • is_base_of
    • is_abstract
    • is_union
    • is_polymorphic
    There are plenty more, they're quite neat. As you may have guessed from the first half of this post, there's also a predicate is_arithmetic, with which you can test if a type is, well, a floating point or an integral type. Handy. Let's try... 
     #include <iostream>  
     #include <string>  
     #include <type_traits>  
       
     int main(int argc, char* argv[])  
     {  
         const bool intIsArithmetic = std::is_arithmetic<int>();  
         const bool stringIsArithmetic = std::is_arithmetic<std::string>();  
       
         std::cout << "Int is arithmetic: " << (intIsArithmetic ? "yes" : "no") << std::endl;  
         std::cout << "String is arithmetic: " << (stringIsArithmetic ? "yes" : "no") << std::endl;  
       
         return 0;  
     }  
       
    

    On my machine (and yours too I hope), this prints out:

    Int is arithmetic: yes

    String is arithmetic: no

    With this information, we could go on and create a feature that returns min and max values only in the case where the requested type is arithmetic. The shame is, I see no way to make this work for our own, custom numeric types (think complex numbers or long long long ints). Anyone got any ideas?
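One possible answer, offered as a sketch rather than a definitive design: std::numeric_limits has an is_specialized member that is true only for types with a real specialization, and the standard permits specializing the template for user-defined types. The names checkedMax and Percent below are hypothetical, and the specialization defines only the members this example touches; a complete one would supply the rest.

```cpp
#include <limits>

// Refuses to compile for any type that has no numeric_limits specialization.
template <class T>
constexpr T checkedMax()
{
    static_assert(std::numeric_limits<T>::is_specialized,
                  "checkedMax requires a type with numeric limits");
    return std::numeric_limits<T>::max();
}

// A hypothetical custom numeric type.
struct Percent
{
    int value;
};

// Opting the custom type in by specializing std::numeric_limits for it.
// Assumption: only the members used in this sketch are provided.
namespace std
{
    template <>
    class numeric_limits<Percent>
    {
    public:
        static constexpr bool is_specialized = true;
        static constexpr Percent min() noexcept { return Percent{ 0 }; }
        static constexpr Percent max() noexcept { return Percent{ 100 }; }
    };
}
```

With this in place checkedMax<int>() and checkedMax<Percent>() both build, while checkedMax<std::string>() is rejected at build time by the static_assert instead of misbehaving at run time.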

    Wednesday 15 May 2013

    std::chrono

    C++11 has introduced the <chrono> header and, in turn, the std::chrono sub-namespace. This is good.

    I don't think many in the standards committee would claim this is a revolutionary addition; there are only a couple of classes that can be instantiated and a couple of free functions allowing us to cast between some of the representations of time. Nonetheless, cross-platform features for reasoning about time are extremely welcome and could go a long way to improving code in this area.

    One thing I can always get on board with is improving the semantic richness and the strength of the types that fly around an application. You've probably seen code for dealing with seconds and minutes that works something like this:
     const int MINUTES_TO_SECONDS = 60;  
       
     const int timeToSurviveMinutes = 10;  
       
     const int timeToSurviveSeconds = timeToSurviveMinutes * MINUTES_TO_SECONDS;  
    

    I don't think anybody loves code like this, but we accept it. Putting aside the overflow issues, the compiler has no comprehension of what your intent for these variables is, so you can merrily (accidentally) interchange them:
     void secondsToParsecs(int seconds)  
     {  
         // PARSECS ARE A UNIT OF DISTANCE NOT TIME  
     }  
       
     void doStuff()  
     {  
         const int MINUTES_TO_SECONDS = 60;   
         
         const int timeToSurviveMinutes = 10;   
         
         const int timeToSurviveSeconds = timeToSurviveMinutes * MINUTES_TO_SECONDS;   
       
         // bug! This function expects a value in seconds  
         secondsToParsecs(timeToSurviveMinutes);  
     }  
       

    So, now we have the tools to represent these properly! This is C++, so they're extremely generic and template-heavy, but at least you can do things like this:
     #include <chrono>  
       
     void secondsToParsecs(const std::chrono::seconds &s)  
     {  
         // ...  
     }  
       
     void doStuff()  
     {  
         // represents 15 minutes  
         const std::chrono::minutes m{15};  
        
         // convert to seconds
         const std::chrono::seconds s{std::chrono::duration_cast<std::chrono::seconds>(m)};  
       
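         // the same slip as before, but this time the compiler saves us: minutes convert  
         // implicitly (and losslessly) to seconds, so the function receives 900 seconds  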
         secondsToParsecs(m);  
     }  
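To make the conversion rules concrete, here is a small sketch with hypothetical function names. Going from a coarser unit to a finer one is lossless, so it happens implicitly; going the other way can drop a remainder, so it has to be spelled out with duration_cast:

```cpp
#include <chrono>
#include <type_traits>

// minutes -> seconds loses nothing, so the conversion is implicit.
inline std::chrono::seconds toSeconds(std::chrono::minutes m)
{
    return m;
}

// seconds -> minutes may truncate, so it must be requested explicitly.
inline std::chrono::minutes toWholeMinutes(std::chrono::seconds s)
{
    return std::chrono::duration_cast<std::chrono::minutes>(s);
}

// The compiler only refuses the lossy direction.
static_assert(std::is_convertible<std::chrono::minutes, std::chrono::seconds>::value,
              "minutes convert implicitly to seconds");
static_assert(!std::is_convertible<std::chrono::seconds, std::chrono::minutes>::value,
              "seconds do not convert implicitly to minutes");
```

So passing minutes to a function that takes seconds is safe, whereas passing seconds to a function that takes minutes fails to build until a duration_cast makes the truncation explicit.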
    

    So far, we've looked at one of the three main features in the <chrono> header: durations. Durations are part of the <chrono> trifecta, which also includes time points and clocks. The three are closely related:
    • Time points are durations relative to an epoch
    • Clocks relate time points to real-world time
    So we can create simple time points using durations relative to the system epoch:
     #include <chrono>  
     #include <ctime>  
     #include <iostream>  
     int main(int argc, char **argv)  
     {  
          // get a time point zero seconds after the epoch  
          std::chrono::seconds noTime{0};  
          std::chrono::system_clock::time_point systemEpoch{noTime};  
    
          // convert to a time_t for convenient printing  
          std::time_t tt{std::chrono::system_clock::to_time_t(systemEpoch)};  
          std::cout << "System epoch is: " << ctime(&tt) << std::endl;  
          return 0;  
     }  
    

    Note that I've used std::chrono::system_clock::time_point as a shortcut for std::chrono::time_point<std::chrono::system_clock>. The former is a simple typedef for the latter.

    Finally, we can use clocks to reason about real-world time in terms of time points. The standard library provides high_resolution_clock, steady_clock and system_clock:
    • system_clock is a realtime clock
    • steady_clock is a clock that never returns a value lower than a previous return; it is monotonic. This makes it useful for calculating time intervals (timing operations)
    • high_resolution_clock is the clock with the highest resolution available. The standard states that it might be a synonym for either of the aforementioned clocks and, in the version of the standard library I've been using, it's a typedef for system_clock.
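Each clock advertises whether it is monotonic through a static is_steady member. The sketch below (the helper name is hypothetical) checks the one guarantee the standard makes; the value of is_steady for system_clock and high_resolution_clock varies between implementations, so it is not asserted here:

```cpp
#include <chrono>

// steady_clock is required to be monotonic; is_steady reports that guarantee.
static_assert(std::chrono::steady_clock::is_steady,
              "steady_clock never goes backwards");

// Hypothetical helper: true if two successive readings did not run backwards.
inline bool steadyClockIsMonotonicHere()
{
    const auto first = std::chrono::steady_clock::now();
    const auto second = std::chrono::steady_clock::now();
    return second >= first;
}
```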
    We combine these features to do interesting things relative to the current time. We can access the current time and use a duration to create a time point exactly ten hours from now:
     #include <chrono>  
     #include <ctime>  
     #include <iostream>  
     int main(int argc, char **argv)  
     {  
          // get the current time:  
          std::chrono::system_clock::time_point now{std::chrono::system_clock::now()};  
    
          // calculate ten hours later:  
          std::chrono::system_clock::time_point tenHoursLater{now + std::chrono::hours(10)};  
    
          // what have we got?  
          std::time_t tt{std::chrono::system_clock::to_time_t(tenHoursLater)};   
          std::cout << "Ten hours from now is: " << ctime(&tt) << std::endl;   
          return 0;  
     }  
    

    Better still, we can calculate the time a certain operation takes without resorting to platform-specific APIs:
     #include <chrono>  
     #include <iostream>  
       
     // assumed to be defined elsewhere  
     void doLongAndBoringOperation();  
       
     int main(int argc, char** argv)  
     {  
          // get the current time:  
          std::chrono::steady_clock::time_point tp1{std::chrono::steady_clock::now()};  
       
          // perform some long and boring operation  
          doLongAndBoringOperation();  
       
          // get the current time again:  
          std::chrono::steady_clock::time_point tp2{std::chrono::steady_clock::now()};  
       
          // see how long it took  
          std::chrono::duration<double> timeTaken{std::chrono::duration_cast<std::chrono::duration<double>>(tp2-tp1)};  
          std::cout << "Wow, that took " << timeTaken.count() << " seconds!" << std::endl;  
       
          return 0;  
     }  
    

    Not bad, eh?
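If timing crops up often, the pattern wraps up neatly. The following is a sketch under the assumption that anything callable with no arguments is acceptable; timeIt is a hypothetical name:

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs any callable and returns the elapsed time in seconds.
template <class Callable>
std::chrono::duration<double> timeIt(Callable &&operation)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Callable>(operation)();
    const auto finish = std::chrono::steady_clock::now();

    // duration<double> has a floating-point representation, so the
    // conversion is implicit and no duration_cast is needed.
    return finish - start;
}
```

Usage would be something like const auto taken = timeIt([] { doLongAndBoringOperation(); }); followed by taken.count() to get the number of seconds.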

    Friday 22 March 2013

    noexcept

    If you've been developing C++ over the last several years you'll probably be aware of exception specifications, and probably not for good reasons. The feature was variably implemented across popular compilers, ranging from "not at all implemented" through to "we'll just do what we like", which led to an unfortunate situation whereby a feature specified by the standard could not be relied upon in portable code. So far, so disappointing.

    The other major problem with exception specifications is that they don't actually help a great deal and their behaviour could be a little... surprising. Their (official) effect was a mixture of compile-time and run-time behaviours, which conspired to not really help all that much while affecting performance more than one might expect.

    Consider this sample, which compiles without warning (well, apart from unref'd variables):


     class A {};  
     class B {};  
     class ExceptionTest  
     {  
     public:  
          void throwsB() throw(A)  
          {  
               // whoops, wrong exception spec  
               throw B();  
          }  
     };  
     int main()  
     {  
          ExceptionTest e;  
          try  
          {  
               // expecting a B  
               e.throwsB();  
          }  
          catch(B &b)  
          {  
          }  
          return 0;  
     }  
    

    So... we get no compile-time protection against throwing incorrect exceptions. What we do get is run-time checking, so when an invalid exception is thrown, unexpected() gets called, which by default results in terminate() being called. This behaviour can be customised, but that's not important here.

    There is some limited compile-time checking. For example, a function overriding a virtual function can't declare an exception specification that is looser than that of the function it is overriding. Wow. Anyway, exception specifications are dead (deprecated in C++11), so what now? What we really want is some kind of composable, logical compile-time checking that alerts the developer to bugs that would otherwise manifest at run-time. We want something analogous to const correctness.

    The C++11 offering is noexcept, and it's similar and it's different. It's similar in that what appears to be a guarantee or a promise is nothing of the sort. In fact, code that violates the noexcept specifier will compile happily and then just make a call to std::terminate() when the exception tries to escape; unlike the old specifications, std::unexpected() is not involved. Hey, at least no exceptions escaped! The reason for this is discussed at length and in excellent detail in Andrzej's C++ blog.

    So, is this a rejection of static checking of exceptions? No. It sounds like the standards committee have the same aspiration as the rest of us: to get the compiler to help us by raising an error when an exception could be emitted from some function that has promised not to.

    Why does noexcept exist? Currently, it's another compiler hint. If the compiler can rely upon no exception escaping from a function (one way or another), certain optimisations become possible. In the future, maybe compiling identical code with a compiler implementing the next major standard will check your exception guarantees. noexcept has other minor advantages, for which I will again direct you to Andrzej's blog rather than reproducing his good work here.
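What the compiler can already do is report the promise back to you. A short sketch, with hypothetical function and type names, of the noexcept operator and of a specification that depends on another expression:

```cpp
// Hypothetical functions used only to exercise the noexcept operator.
inline int mayThrow(int x) { return x + 1; }
inline int neverThrows(int x) noexcept { return x + 1; }

// The operator looks at the declaration, not the body, so mayThrow counts
// as potentially throwing even though it never actually throws.
static_assert(!noexcept(mayThrow(1)), "no promise was made");
static_assert(noexcept(neverThrows(1)), "the promise is visible at compile time");

struct Quiet { void operator()() const noexcept {} };
struct Loud  { void operator()() const {} };

// A conditional specification: callOnce is noexcept only if calling f is.
template <class F>
void callOnce(F f) noexcept(noexcept(f()))
{
    f();
}

static_assert(noexcept(callOnce(Quiet())), "inherits the callable's promise");
static_assert(!noexcept(callOnce(Loud())), "no promise to inherit");
```

None of this checks that the promise is kept, which is exactly the limitation described above; it only lets generic code ask whether the promise was made.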

    Wednesday 16 January 2013

    std::current_exception()

    Ok, this is a very nice idea, and also very simple. std::current_exception returns a std::exception_ptr to the exception currently being handled by a catch{} block. The following snippet shows it in action quite well, but I don't expect that this kind of thing will take off for current_exception. It might even be something of an antipattern in its current form:
     try  
     {  
      someFunction();  
     }  
     catch(...)  
     {  
      auto exPtr = std::current_exception();  
      /* now we have an exception pointer to the current exception... */  
     }  
    

    This is all well and good, but what's the point? In search of that we look a bit deeper at std::exception_ptr, which provides shared ownership semantics for exceptions in a similar fashion to the shared ownership of objects provided by std::shared_ptr. Exception pointers can be created from exceptions:
     try  
     {  
      someFunction();  
     }  
     catch(std::exception &ex)  
     {  
      // note: this copies ex as a plain std::exception, slicing off any derived type  
      std::exception_ptr exPtr = std::make_exception_ptr(ex);  
      /* now we have an exception ptr... */
     }  
    

    The C++11 standard says that make_exception_ptr works roughly like this:
     template<class E> exception_ptr make_exception_ptr(E e) noexcept  
     {  
      try   
      {  
       throw e;  
      }   
      catch(...)  
      {  
       return current_exception();  
      }  
     }  
    

    It can't be the case that std::current_exception exists only to service std::make_exception_ptr, so what's all this about? The answer shouldn't surprise you: threading. Exceptions are notoriously bad at cross-thread error propagation and until C++11 there's been no simple cross-platform way of doing this. Sure, you could catch your exception, stash it somewhere and let another thread rethrow it to code that can do something about it, but the problem is that you need to manually account for all types of exception that can be thrown. This doesn't scale well and leaves you responsible for actively managing the lifetime of your exception object.

    Using std::current_exception() to get a std::exception_ptr to anything that's been thrown (and remember, you can throw just about anything in C++!), you can store a container of those exception pointers in one thread and use std::rethrow_exception to propagate them again in another thread. This is possible because, although the precise definition of std::exception_ptr is unspecified, it is untemplated and therefore opaque enough to store in a homogeneous container. As soon as your exception is no longer referenced by at least one std::exception_ptr, its lifetime ends and the resources are freed up.

    What could be simpler?

    The "transporting exception across the thread boundary" pattern is implemented in this example from Microsoft.
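Independently of that example, here is a minimal sketch of the pattern using std::thread. The names worker and runAndReport are hypothetical, and a real program would more likely collect several exception pointers in a container guarded by a mutex:

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical worker: fails, and parks whatever was thrown in 'error'.
inline void worker(std::exception_ptr &error)
{
    try
    {
        throw std::runtime_error("worker failed");
    }
    catch (...)
    {
        // capture the in-flight exception, whatever its type
        error = std::current_exception();
    }
}

// Runs the worker on another thread and rethrows its exception on this one.
inline std::string runAndReport()
{
    std::exception_ptr error;
    std::thread t(worker, std::ref(error));
    t.join();   // after join, the worker's write to 'error' is visible here

    try
    {
        if (error)
        {
            std::rethrow_exception(error);
        }
    }
    catch (const std::exception &ex)
    {
        return ex.what();
    }
    return "no error";
}
```

The calling thread never needs to know which exception types the worker can throw; it just rethrows and lets its ordinary catch clauses sort it out.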

    Monday 14 January 2013

    std::to_string()

    Starting simple, I was really pleased and a little surprised to see to_string (and its sibling, to_wstring) pop up. It's a feature that we've really needed for more time than I care to consider and it removes the need for the fairly foul: 

     char buff[12] = {0};  
     sprintf(buff, "%d", INT_MIN);  
     /* convert to std::string if we like */  
    

    or the moderately foul:


     std::ostringstream mystream;  
     mystream << INT_MIN;  
     std::string s = mystream.str();  
    

    I'm not saying there aren't proper uses for sprintf or stringstream, it's just that the above seems such a waste of time when you consider Python's str() or even C#'s versatile Convert.ToString(). Neither of my examples is very expressive and the first gets even worse if you're developing for a Microsoft platform and you have to start considering sprintf_s and the like.

    std::to_string is long overdue and extremely simple to use:


     float f = 3.14f;  
     std::string piString = std::to_string(f);  // "3.140000", formatted as if by "%f"  
    

    What could be easier than that? No worrying about buffer sizes, no baroque format string incantations, no stringstream wastes. That ends the first real post. If anyone posts comments, I promise to read them. Is anybody out there?

    Hi!

    I'm a big fan of the new(ish) C++11 standard. It's not that I think it patches the problems with previous C++ standards: it's still more than possible to leak memory, construct truly horrible deadlock scenarios and accidentally take a copy of vast amounts of data. The difference is that it enables and encourages a way of writing C++ code that dramatically reduces the likelihood of you encountering the above situations. Please don't just take my word for it; there's lots written on this subject by people far more intelligent than me and I hope you have the time to check it out.

    I was inspired to have a go at writing this technical blog by the sheer number of surprising things that crop up when I'm watching the experts discuss the new headline C++11 features. I'm talking mostly about the types added to the standard library, some of which are incredibly feature rich and expressive and have received almost no coverage online. Also, there is a distinct paucity of examples that I might be able to begin to address here.

    So that's it. I'm going to give it a go. At some point, sooner or later, I will run out of things that surprise me in C++11 and then I'll just stop. Until then, I hope these posts are of at least some small utility to some of you.