Archive for the ‘VC++’ Category

Visual Studio 2012 First Impressions

Tuesday, October 9th, 2012

A few weeks ago we released ODB 2.1.0. Besides a large number of new features, this version also added support for Visual Studio 2012. Specifically, all the runtime libraries, examples, and tests now come with project/solution files for and were successfully tested with Visual Studio 2012 in addition to 2010 and 2008. This blog post is a collection of notes on differences between Visual Studio 2010 and 2012 that we encountered while working on adding this support. The notes primarily focus on C++ and you may find them useful if you plan to migrate to 2012 or if you want to add support for this version of Visual Studio in your project.

The first thing that you will notice when installing Visual Studio 2012 (VS2012 from now on) is that you can no longer install just the C++ development environment. Now all the languages are always installed. So make sure you have enough free space available. VS2012 co-exists fine with VS2010 and VS2008 provided you install SP1 for VS2010. Failed that you will get a weird error from the linker saying that the COFF file is corrupt. To test ODB we have all three versions of VS installed on the same virtual machine and everything works fine.

After installing VS2012 I was preparing myself mentally for the mind-numbing task of setting up all the include/library search paths in VC++ Directories. In my case those are needed for all the 5 databases that ODB supports, all the ODB runtime libraries, plus Boost and Qt. Multiply that by two for 32 and 64-bit configurations, and you end up with a lot of directories. BTW, getting to the VC++ Directories dialog in VS2012 is exactly the same as in VS2010. That is, open the Property Manager tab, then open Microsoft.Cpp.Win32.User or Microsoft.Cpp.x64.User sheet.

Imagine my surprise when I opened this dialog for the first time and saw all the directories pre-populated! For a moment I thought, finally, Microsoft actually did something sensible for a change. However, my joy was short lived. At first I thought that VS2012 simply copied the directories from VS2010 and all I needed to do is just tweak them a bit. But then I realized that Microsoft could have also done something else: they could have shared the entries with VS2010! So I quickly modified one entry in VS2012 and sure enough I saw the same modification appearing in VS2010.

Why is this bad news? We all know that mixing libraries built using one version of VS with applications that use another is asking for trouble. In fact, some libraries, such as Qt, go a step further and actively prevent you from doing this by prefixing their symbols with the VS version. The fact that you are now forced to share VC++ Directories between VS2010 and 2012 just shows that Microsoft still doesn’t understand how developers are using their product.

Luckily, there is a fairly easy workaround for this problem. The idea is to include the VS version number into the path and then use the $(VisualStudioVersion) variable in the VC++ Directories entries. Here is an example for the ODB runtime library, libodb. First step is to create two directories with the library source code, say libodb-vc10.0 and libodb-vc11.0. Then build libodb-vc10.0 with VS2010 and libodb-vc11.0 with VS2012. Once this is done, the final step is to open the VC++ Directories dialog (doesn’t matter whether it is VS2010 or 2012) and add the following paths:

Include: ...\libodb-vc$(VisualStudioVersion)
Library: ...\libodb-vc$(VisualStudioVersion)\lib

In ODB, VS project and solution files are automatically generated from templates. So for us the first step in adding support for VS2012 was to come up with suitable library and executable project templates. As it turned out, it was actually easier to convert VS2010 projects to VS2012 rather than create ones from scratch. Comparing the 2010 and 2012 project files (.vcxproj) revealed that the only difference between the two is the addition of the following PlatformToolset tag after UseDebugLibraries in each configuration:


Similar to the project files, the VS2012 solution files (.sln) are exactly the same except for the VS version embedded in them. The project filters files (.vcxproj.filters) haven’t changed. So all that was needed to convert VS2010 project/solution templates to VS2012 was a couple of simple sed scripts. Overall, in case of ODB, it took about half a day to create all the templates and update all the scripts. And another day to build and test all the configurations for all the databases.

I also haven’t encountered any spurious errors or warnings when compiling ODB with VS2012 compared to 2010. Though ODB source code was already fairly clean thanks to being tested with the latest versions of GCC and Clang with all the warnings enabled.

Compilation speed-wise, I haven’t noticed any significant improvements. While I haven’t done any thorough testing in this area, SQLite build times show a marginal improvement (39s for VS2012 vs 43s for 2010).

Finally, when it comes to C++11 support in VS2012, it appears Microsoft concentrated on user-visible features (like range-based for-loop) at the expense of library-level ones. As a result, the list of unsupported C++11 features that could be useful in ODB is exactly the same for VS2012 as for 2010:

#if _MSC_VER >= 1600
#  define ODB_CXX11
#  define ODB_CXX11_NULLPTR

Do we need std::buffer?

Tuesday, August 9th, 2011

Or, boost::buffer for starters?

A few days ago I was again wishing that there was a standard memory buffer abstraction in C++. I have already had to invent my own classes for XSD and XSD/e (XML Schema to C++ compilers) where they are used for mapping the XML Schema hexBinary and base64Binary types to C++. Now I have the same problem in ODB (an ORM system for C++) where I need a suitable C++ type for representing database BLOB types. This time I have decided against creating another copy of my own buffer class and instead use the poor man’s “standard” buffer, std::vector<char>, with its unnatural interface and all.

The abstraction I am wishing for is a simple class for encapsulating the memory management of a raw memory buffer plus providing a few common operations, such as memcpy, memset, etc. So instead of writing this:

class person
  person (char* key_data, std::size_t key_size)
    : key_size_ (key_size)
    key_data_ = new char[key_size];
    std::memcpy (key_data_, key_data, key_size);
  ~person ()
    delete key_data_;
  char* key_data_;
  std::size_t key_size_;

Or having to create yet another custom buffer class, we could do this:

class person
  person (char* key_data, std::size_t key_size)
    : key_ (key_data, key_size)
  std::buffer key_;

Above I called vector<char> a poor man’s “standard” buffer. But what exactly is wrong with using it to manage a memory buffer? While it works reasonably well functionally, the interface is unnatural and some operations may not be as efficient as we would expect from a memory buffer. Let’s examine the most prominent examples of these issues.

The first problem is with how we access the underlying memory. The C++ standard defect report (DR) 464 added the data() member function to std::vector which returns a pointer to the buffer. However, there are still compilers in use that do not support this, notably GCC 3.4 and VC++ 2008/9.0. As a result, if you want your code to be portable, you will need to use the much less intuitive &b.front() expression:

vector<char> b = ...
memcpy (out, &b.front (), b.size ());

There is also a subtle issue with using front(). While it appears to be legal to call data() on an empty buffer (as long as we don’t dereference the returned pointer), it is illegal to call front(). This means that you may have to handle an empty buffer as a special case, further complicating your code:

vector<char> b = ...
memcpy (out, (b.empty () ? 0 : &b.front ()), b.size ());

The initialization of a buffer is also inconvenient and potentially inefficient. Let’s say we want to have an uninitialized buffer of 1024 bytes which we plan to fill in later. There is no way to do that with vector<char>. The best we can do is to have every byte initialized:

vector<char> b (1024); // Zero-initialized buffer.

If we want to create a buffer initialized with contents of a memory fragment, the interface we have to use is cumbersome:

vector<char> b (data, data + size);

What we want to write instead is this:

buffer b (data, size);

This initialization is also potentially inefficient. Depending on the quality of the implementation, std::vector may end up using a for loop instead of memcpy to copy the data. In fact, that’s exactly how it is done in GCC 4.5 and VC++ 2010/10.0 (Correction: as was pointed out in the comments, both GCC 4.5 and VC++ 10 optimize the case where the vector element is POD).

So I think it is quite clear that while vector<char> is workable, it is not particularly convenient or efficient.

Also, as it turns out this is not the first time I am playing with the idea of a dedicated buffer class in C++. A couple of months ago I started a thread on the Boost developer mailing list trying to see if there would be any interest in a simple buffer library in Boost. The result wasn’t very encouraging. The thread quickly splintered into discussions of various special-purpose, buffer-like data structures that people have in their applications.

On the other hand, I mentioned the buffer class at BoostCon 2011 to a couple of Boost users and got very positive responses, along the “If it were there we would use it!” lines. That’s when I got the idea of writing this article in an attempt to get feedback from the broader C++ community rather than from just the hard-core Boost developers (only they can withstand the boost-dev mailing list traffic).

While the above discussion should give you a pretty good idea about the kind of buffer class I am talking about, below I am going to show a proposed interface and provide a complete, header-only implementation (released under the Boost license), in case you would like to give it a try.

class buffer
  typedef std::size_t size_type;
  static const size_type npos = -1;
  ~buffer ();
  explicit buffer (size_type size = 0);
  buffer (size_type size, size_type capacity);
  buffer (const void* data, size_type size);
  buffer (const void* data, size_type size, size_type capacity);
  buffer (void* data, size_type size, size_type capacity,
          bool assume_ownership);
  buffer (const buffer&);
  buffer& operator= (const buffer&);
  void swap (buffer&);
  char* detach ();
  void assign (const void* data, size_type size);
  void assign (void* data, size_type size, size_type capacity,
               bool assume_ownership);
  void append (const buffer&);
  void append (const void* data, size_type size);
  void fill (char value = 0);
  size_type size () const;
  bool size (size_type);
  size_type capacity () const;
  bool capacity (size_type);
  bool empty () const;
  void clear ();
  char* data ();
  const char* data () const;
  char& operator[] (size_type);
  char operator[] (size_type) const;
  char& at (size_type);
  char at (size_type) const;
  size_type find (char, size_type pos = 0) const;
  size_type rfind (char, size_type pos = npos) const;
  char* data_;
  size_type size_;
  size_type capacity_;
  bool free_;
bool operator== (const buffer&, const buffer&);
bool operator!= (const buffer&, const buffer&);

Most of the interface should be self-explanatory. The last overloaded constructor allows us to create a buffer by reusing an existing memory block. If the assume_ownership argument is true, then the buffer object will free the memory using delete[]. The detach() function is the mirror side of this functionality in that it allows us to detach the underlying memory block and reuse it in some other way. After the call to detach() the buffer object becomes empty and we should eventually free the returned memory using delete[]. The size() and capacity() modifiers return true to indicate that the underlying buffer address has changed, in case we cached it somewhere.

So, do you think we need something like this in Boost and perhaps in the C++ standard library? Do you like the proposed interface?

Microsoft DLL export and C++ templates

Monday, January 18th, 2010

The other day I stumbled upon a really dark corner of the Microsoft dllexport/dllimport machinery. I can vividly see Windows toolchain engineers waking up in the middle of the night from a nightmare where they had to patch yet another crack in this DLL symbol export mess. This one has to do with the interaction of dllexport and C++ templates.

It all started with a user reporting duplicate symbol errors when he tried to split the XSD-generated code into two DLLs. The duplicate symbols were reported when linking the second DLL that depends on the “base” DLL and pointed to the destructor and assignment operator of a template instantiation, let’s say std::vector<int>. There were two additional strange things about this case: the errors only occurred in the debug build and there were a number of other users that have done a similar thing but never got any errors. The fact that the errors only appeared in the debug build got me thinking that in the release build these functions were inlined. The second strange aspect was harder to figure out: there was something special about this particular codebase that caused the error. After some investigation the following code fragment in the first DLL turned out to make the difference (BASE_EXPORT expands to either __declspec(dllexport) or __declspec(dllimport)):

class BASE_EXPORT ints: public std::vector<int>

As it turns out (see at the end of the General Rules and Limitations article in MSDN), if an exported class inherits from a template instantiation that is not explicitly exported (yes, you can export certain instantiations of a template, see below), then the compiler implicitly applies dllexport to this template instantiation. So the above code fragment exports both the ints class and the std::vector<int> instantiation. On the surface this automatic exporting looks like a good idea. After all, if you export the derived class you will also need to export all its public bases since they are part of the interface. In the case of the non-template bases you need to use the export mechanism explicitly which makes sense. In the case of templates, you don’t want to have to explicitly export every instantiation. Plus, as pointed out in the MSDN article above, it is not always possible.

But here is the other half of the picture: in the second DLL there is a source code file that doesn’t know anything about the ints class (that is, it doesn’t include the ints declaration). It also happens to use std::vector<int> in a fairly common way:

void f ()
  std::vector<int> v;

When the second DLL is linked, we end up with two sets of symbols for std::vector<int>: the first is exported from the “base” DLL and the second set is the result of the template instantiation in the above source code file. Duplicate symbol errors ensue.

At first it might seem puzzling that the same doesn’t happen with ordinary classes that contain inline functions. What if a class is exported from one DLL and then we use it in another? This doesn’t lead to errors even when inline functions are not inlined because in order to use the class we need to include its declaration. Once we do that all of its functions become imported from the first DLL and instead of “instantiating” an inline function the compiler simply uses the imported version from the first DLL. We get errors in the above scenario because when VC++ compiles the source file in the second DLL it has no knowledge of the fact that the functions it is about to instantiate were exported from the “base” DLL which this DLL happens to link to.

In standard C++ the toolchain is required to weed out the duplicate symbols that result from instantiations of the same template. When DLLs are involved, VC++ is unable to meet this requirement.

There is no clean way to work around this. In the scenario described above we can add an explicit import declaration for the std::vector<int> instantiation:

template class __declspec(dllimport) std::vector<int>;
void f ()
  std::vector<int> v;

Normally one would collect such manual imports in one header file and then include this file into every source file in the DLL.

The major issue with this approach, apart from having to manually track imports, is that if you have two independent DLLs that each happen to auto-export std::vector<int> and you need to link to both of them, there is nothing you can do without changing at least one of those DLLs.

It also appears that Microsoft itself suffered from this pitfall as evident from the Exporting String Classes Using CStringT article in MSDN. The solution that it describes seems to be specific to this particular case, not that I could understand it fully.