Two typical multi-threaded programming mistakes

A couple of days ago I had to fix two threading-related mistakes that I thought were very typical of people who are new to the “threaded way of thinking”. Here is the simplified version of that code. Don’t read too much into whether this is the best way to implement a string pool:

class string_pool
{
public:
  typedef std::map<std::string, std::size_t> map;
 
  string_pool (const map& ro_map)
    : ro_map_ (ro_map)
  {
  }
 
  std::size_t
  find_or_add (const std::string& s)
  {
    // First check the read-only map.
    //
    map::const_iterator i = ro_map_.find (s);
 
    if (i != ro_map_.end ())
      return i->second_;
 
    // Check the writable map.
    //
    i = map_.find (s);
 
    if (i != map_.end ())
      return i->second;
    else
    {
      // We are about to modify the map so we have to
      // synchronize this part.
      //
      auto_lock l (mutex_);
      std::size_t id = ro_map_.size () + map_.size () + 1;
      map_.insert (std::make_pair (s, id));
    }
  }
 
private:
  map map_;
  const map& ro_map_;
  mutex mutex_;
};

Can you spot the two problems with the above code? The first one is related to the misconception that the sole purpose of mutexes and similar primitives is to make sure that no two threads try to modify the same data simultaneously. This leads to the incorrect idea that we only need to synchronize the parts of code that modify the data while the parts that only read it can do so freely. On modern CPU architectures mutexes play another important role: they make sure that the changes made by one thread are actually visible to others and visible in the order they were made.

In the above code the problematic place is the unsynchronized check for the existence of the string in the writable map. What are some undesirable things that can happen here? For example, one thread could add a string while others won’t “see” this change and will all try to re-add the same string. Or, worse yet, a thread could only see part of the change. For example, the map_.insert() implementation could allocate a new, uninitialized map entry, update all the house-keeping information, and then initialize it with the passed pair. Imagine what will happen if the reading threads only see the result of the first part of this operation.

The other problem is more subtle. Compare these two code fragments:

  std::size_t id = ro_map_.size () + map_.size () + 1;
  std::size_t n = ro_map_.size ();
  std::size_t id = n + map_.size () + 1;

There is little semantic difference between the two. Now consider what happens when we add synchronization:

  auto_lock l (mutex_);
  std::size_t id = ro_map_.size () + map_.size () + 1;
  std::size_t n = ro_map_.size ();
  auto_lock l (mutex_);
  std::size_t id = n + map_.size () + 1;

Do you see the difference now? In the first fragment we call ro_map_.size(), which doesn’t need any synchronization, while having the mutex locked. This can result in higher contention between threads since we are holding the mutex for longer. While the call to size() here is probably negligible (the actual code I had to fix was calling virtual functions), it underlines a general principle: you should try to execute all the code that doesn’t need synchronization before acquiring the mutex.

Here is how we can fix these two problems in the above code:

  std::size_t
  find_or_add (const std::string& s)
  {
    // First check the read-only map.
    //
    map::const_iterator i = ro_map_.find (s);
 
    if (i != ro_map_.end ())
      return i->second_;
 
    // Perform operations that can be done in
    // parallel with other threads.
    //
    std::size_t n = ro_map_.size ();
    map::value_type pair (s, n + 1);
 
    // We are about to access the writable map so
    // synchronize this part.
    //
    auto_lock l (mutex_);
    i = map_.find (s);
 
    if (i != map_.end ())
      return i->second;
    else
    {
      pair.second += map_.size ();
      map_.insert (pair);
    }
  }

Comments are closed.