Archive for the ‘C++’ Category

Virtual inheritance overhead in g++

Thursday, April 17th, 2008

By now every C++ engineer worth her salt knows that virtual inheritance is not free. It has object code, runtime (both CPU and memory), as well as compilation time and memory overheads (for an in-depth discussion on how virtual inheritance is implemented in C++ compilers see “Inside the C++ Object Model” by Stanley Lippman). In this post I would like to consider the object code as well as compilation time and memory overheads since in modern C++ implementations these are normally sacrificed for the runtime speed and can present major surprises. Unlike existing studies on this subject, I won’t bore you with “academic” metrics such as per class or per virtual function overhead or synthetic tests. Such metrics and tests have two main problems: they don’t give a feeling of the overhead experienced by real-world applications and they don’t factor in the extra code necessary to account for the lack of functionality otherwise provided by virtual inheritance.

It is hard to come by non-trivial applications that can provide the same functionality with and without virtual inheritance. I happened to have access to such an application and what follows is a quick description of the problem virtual inheritance was used to solve. I will then present some measurements of the overhead by comparing to the same functionality implemented without virtual inheritance.

The application in question is XSD/e, a validating XML parser/serializer generator for embedded systems. Given a definition of an XML vocabulary in XML Schema, it generates a parser skeleton (C++ class) for each type defined in that vocabulary. Types in XML Schema can derive from each other, and if two types are related by inheritance then it is often desirable to be able to reuse the base parser implementation in the derived one. To support this requirement, the current implementation of XSD/e uses the C++ mixin idiom that relies on virtual inheritance:

// Parser skeletons. Generated by XSD/e.
//
struct base
{
  virtual void
  foo () = 0;
};
 
struct derived: virtual base
{
  virtual void
  bar () = 0;
};
 
// Parser implementations. Hand-written.
//
struct base_impl: virtual base
{
  virtual void
  foo ()
  {
    ...
  }
};
 
struct derived_impl: virtual derived,
                     base_impl
{
  virtual void
  bar ()
  {
    ...
  }
};
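
For readers who want to try the idiom, here is a minimal compilable sketch of the same hierarchy. The trace member, the virtual destructor, and the run_mixin() function are additions made up for this illustration and are not part of what XSD/e generates; the elided function bodies are replaced with code that merely records which callback was invoked.

```cpp
#include <string>

// Parser skeletons (stand-ins for what XSD/e would generate).
//
struct base
{
  virtual
  ~base () {}

  virtual void
  foo () = 0;
};

struct derived: virtual base
{
  virtual void
  bar () = 0;
};

// Parser implementations. The trace member is hypothetical and only
// records which callbacks were invoked.
//
struct base_impl: virtual base
{
  std::string trace;

  virtual void
  foo ()
  {
    trace += "foo;";
  }
};

struct derived_impl: virtual derived,
                     base_impl
{
  virtual void
  bar ()
  {
    trace += "bar;";
  }
};

// Drive the parser through the derived skeleton interface only.
//
inline std::string
run_mixin ()
{
  derived_impl impl;
  derived& d = impl;

  d.foo (); // Ends up in base_impl::foo() via the shared virtual base.
  d.bar ();

  return impl.trace;
}
```

Because base is a virtual base, derived_impl contains a single base subobject and base_impl::foo() becomes the final overrider of foo() for the whole object, which is what lets derived_impl reuse it without writing any forwarding code.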

This approach works well, but we quickly found out that for large vocabularies with hundreds of types the object code produced by g++ was unacceptably large. Furthermore, on a schema with a little more than a thousand types, g++ with optimization turned on (-O2) ran out of memory on a machine with 2GB of RAM.

After some analysis we determined that virtual inheritance was to blame. To resolve this problem we developed an alternative, delegation-based implementation reuse method (it will appear in the next release of XSD/e) that is almost as convenient to use as the mixin approach, since all the support code is automatically generated by the XSD/e compiler. The idea behind the delegation-based approach is illustrated in the following code fragment:

// Parser skeletons. Generated by XSD/e.
//
struct base
{
  virtual void
  foo () = 0;
};
 
struct derived: base
{
  derived (base* impl)
    : impl_ (impl)
  {
  }
 
  virtual void
  bar () = 0;
 
  virtual void
  foo ()
  {
    assert (impl_);
    impl_->foo ();
  }
 
private:
  base* impl_;
};
 
// Parser implementations. Hand-written.
//
struct base_impl: base
{
  virtual void
  foo ()
  {
    ...
  }
};
 
struct derived_impl: derived
{
  derived_impl ()
    : derived (&base_impl_)
  {
  }
 
  virtual void
  bar ()
  {
    ...
  }
 
private:
  base_impl base_impl_;
};
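
The fragment above can also be turned into a small compilable sketch. As before, the trace member, the trace() accessor, the virtual destructor, and run_delegation() are hypothetical additions for illustration only; the structure of the classes follows the fragment.

```cpp
#include <cassert>
#include <string>

// Parser skeletons (stand-ins for what XSD/e would generate).
//
struct base
{
  virtual
  ~base () {}

  virtual void
  foo () = 0;
};

struct derived: base
{
  derived (base* impl)
    : impl_ (impl)
  {
  }

  virtual void
  bar () = 0;

  // Forward the inherited callback to the base implementation.
  //
  virtual void
  foo ()
  {
    assert (impl_);
    impl_->foo ();
  }

private:
  base* impl_;
};

// Parser implementations. The trace member is hypothetical.
//
struct base_impl: base
{
  std::string trace;

  virtual void
  foo ()
  {
    trace += "foo;";
  }
};

struct derived_impl: derived
{
  // The pointer is only stored here; it is not used until after
  // base_impl_ has been constructed.
  //
  derived_impl ()
    : derived (&base_impl_)
  {
  }

  virtual void
  bar ()
  {
    base_impl_.trace += "bar;";
  }

  const std::string&
  trace () const
  {
    return base_impl_.trace;
  }

private:
  base_impl base_impl_;
};

inline std::string
run_delegation ()
{
  derived_impl impl;
  derived& d = impl;

  d.foo (); // Forwarded by derived::foo() to base_impl::foo().
  d.bar ();

  return impl.trace ();
}
```

The observable behavior is the same as with the mixin version, but there are now two base subobjects (one in derived_impl, one in the base_impl_ member) and the forwarding is spelled out in derived::foo() instead of being arranged by the compiler.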

The test executable built for the above-mentioned thousand-types schema using virtual inheritance, optimized for size (-Os) and stripped, is 15MB in size. It also takes 19 minutes to build, and the peak memory usage of the C++ compiler is 1.6GB. For comparison, the same executable built using the delegation-based approach is 3.7MB in size, takes 14 minutes to build, and has a peak memory usage of 348MB. That’s right, the executable is 4 times smaller. Note also that the generated parser skeletons are not just a bunch of pure virtual function signatures. They include XML Schema validation, data conversion, and dispatch code. The measurements also showed that the runtime performance of the two reuse approaches is about the same (most likely because g++ performs a similar delegation under the hood, except that it has to handle all possible use-cases, hence the object code overhead).
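
To put the three measurements side by side, the ratios can be worked out with a few lines of code. This is only arithmetic on the figures quoted above; the constant names are made up, and the memory ratio assumes 1GB = 1024MB.

```cpp
#include <cassert>

// Figures quoted above for the thousand-types schema
// (vi = virtual inheritance, dg = delegation).
//
const double vi_size_mb = 15.0;          // Executable size.
const double dg_size_mb = 3.7;

const double vi_build_min = 19.0;        // Build time.
const double dg_build_min = 14.0;

const double vi_mem_mb = 1.6 * 1024.0;   // Peak compiler memory,
const double dg_mem_mb = 348.0;          // assuming 1GB = 1024MB.

// How many times larger a is than b.
//
inline double
ratio (double a, double b)
{
  assert (b > 0.0);
  return a / b;
}
```

With these inputs, ratio (vi_size_mb, dg_size_mb) comes out at roughly 4.1, ratio (vi_build_min, dg_build_min) at roughly 1.4, and ratio (vi_mem_mb, dg_mem_mb) at roughly 4.7. In other words, the object code and compiler memory overheads are of a similar magnitude, while the build time is affected much less.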

Xerces-C++ 3.0.0 beta 1 released

Friday, March 14th, 2008

I’ve spent the past three weeks prepping the Xerces-C++ 3.0.0 code for the upcoming release which culminated in the publishing of the first beta yesterday. The major change in 3.0.0 compared to the 2-series releases is the new, autotools-based build system for Linux/UNIX platforms. Other improvements in 3.0.0 include:

  • Project files for VC 9
  • Support for the ICU transcoder in VC 7.1, 8, and 9 project files
  • libcurl-based net accessor
  • Support for XInclude
  • Support for a subset of XPath
  • Conformance to the final DOM Level 3 interface specification
  • Ability to provide custom DOM memory manager
  • Better 64-bit support
  • Cleaned up error messages
  • Better tested, including against W3C XML Schema test suite
  • Removal of the deprecated code

My primary goals in this release are to make it cleaner, easier to build, better tested, as well as to provide better XML Schema support. And it does feel that the 3.0.0 codebase is on track to achieve these goals. If you are planning to upgrade to 3.0.0 once the final version is out, I suggest that you give this beta a try and report any problems so that they can be fixed before the final release. For more details on this beta see the official announcement.

CodeSynthesis XSD 3.1.0 released

Wednesday, February 13th, 2008

XSD 3.1.0 was released a couple of days ago. For an exhaustive list of new features see the official announcement. In this post I would like to go into more detail on a few major features, namely the file-per-type compilation mode and configurable identifier naming conventions.

File-per-type compilation mode

First, some background on the kinds of problems this feature is meant to solve. While in most cases it is natural to generate one set of source files from each schema file and map XML Schema include and import constructs to the preprocessor #include directives, XML Schema include and import mechanisms are quite a bit less strict compared to #include. For example, you can have two schemas each with a type that inherits from a base in another schema (that is, these schemas are dependent on each other and this dependency involves inheritance). Or, you can have a schema that does not include/import definitions for all the types it is referencing. Instead such a schema relies on being included or imported into another schema which provides the missing definitions (while this can also happen in C++, it is not very common). As a result, sometimes it is not possible to compile the schemas separately and/or map XML Schema include/import to C++ #include. For such situations the file-per-type compilation mode was introduced in addition to the existing file-per-schema mode.

In the new mode (the --file-per-type command option), the XSD compiler generates a separate set of files for each type defined in XML Schema. It still generates a set of source files corresponding to the schema files which now include the header files for the types and contain parsing and serialization functions. In this compilation mode you only need to compile the root schema for your vocabulary; the code will be automatically generated for all included and imported schemas. If your vocabulary has several root schemas which in turn include or import a common subset of schemas then you will need to specify all these root schemas in a single invocation of the compiler.

One reason why the file-per-schema mode should be preferred whenever possible is the potentially large number of source files that are generated in the file-per-type mode (some of the schemas that we have tested contain 1,000-1,500 types). To minimize the impact of the file-per-type mode on the C++ compilation time, it is a good idea to generate the XML Schema namespace into a separate header file (see the --generate-xml-schema and --extern-xml-schema options) and to set up a precompiled header.

To help deal with the potentially large number of files that the new mode produces, the new --file-list option was added to the XSD compiler. It allows you to write a list of generated source files into a file. The --file-list-prologue, --file-list-epilogue, and --file-list-delim options allow you to turn this file into, for example, a makefile fragment with the list of files assigned to a variable. The following GNU make fragment shows how to put all of the above information together:

XSD    := ... # path to the XSD compiler
LIBXSD := ... # path to the XSD runtime library
 
driver:
 
# Schema compilation.
#
xsd      := ... # list of all schema files
xsd_root := ... # root schema(s)
 
-include gen.make
 
gen.make: $(xsd)
  $(XSD) cxx-tree --file-per-type --output-dir gen \
--file-list $@ --file-list-prologue "gen := " --file-list-delim " \\n" \
--extern-xml-schema xml-schema.xsd --cxx-prologue '#include "all.hxx"' \
$(xsd_root)
 
gen/xml-schema.hxx:
  $(XSD) cxx-tree --generate-xml-schema --output-dir gen xml-schema.xsd
 
src := driver.cxx $(filter %.cxx,$(gen))
obj := $(src:.cxx=.o)
 
# Precompiled header.
#
$(obj): gen/all.hxx.gch
 
gen/all.hxx.gch: gen/all.hxx gen/xml-schema.hxx
  $(CXX) -I$(LIBXSD) -o $@ $<
 
# Object code and driver.
#
driver: $(obj) -lxerces-c
  $(CXX) -o $@ $^
 
%.o: %.cxx
  $(CXX) -I$(LIBXSD) -c $< -o $@

The gen/all.hxx file is the precompiled header for the project and could look like this:

#ifndef GEN_ALL_HXX
#define GEN_ALL_HXX
 
#warning precompiled header is not used
 
#include "xml-schema.hxx"
 
#endif // GEN_ALL_HXX

Another interesting aspect of the file-per-type compilation mode is how it is implemented in XSD. A straightforward but complex approach would have been to support this mode in the code generators in addition to the file-per-schema mode. Instead, an internal schema graph transformation was implemented that transforms the semantic graph to make it appear as if each type is in a separate schema file. After this transformation the unchanged code generators are used as in the file-per-schema mode.
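
The idea of such a transformation can be sketched with a toy model. The following is not XSD's actual semantic graph or code; all the types and functions (type_def, schema, split_per_type(), header_name()) are hypothetical and only show how a pre-pass can reshape the input so that a generator written for the file-per-schema mode needs no changes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the semantic graph. All names are hypothetical.
//
struct type_def
{
  std::string name;
  std::vector<std::string> refs; // Names of the types this type refers to.
};

struct schema
{
  std::string file;
  std::vector<std::string> includes; // Files this schema includes/imports.
  std::vector<type_def> types;
};

// Rewrite the graph so that every type lives in its own schema which
// includes the schemas of the types it refers to.
//
inline std::vector<schema>
split_per_type (const std::vector<schema>& in)
{
  std::vector<schema> out;

  for (const schema& s: in)
  {
    for (const type_def& t: s.types)
    {
      assert (!t.name.empty ());

      schema r;
      r.file = t.name + ".xsd";
      r.types.push_back (t);

      for (const std::string& n: t.refs)
        r.includes.push_back (n + ".xsd");

      out.push_back (r);
    }
  }

  return out;
}

// The "unchanged" generator step: it derives one header name per
// schema and does not care whether the schema came from a real file
// or from split_per_type().
//
inline std::string
header_name (const schema& s)
{
  std::string::size_type p (s.file.rfind ('.'));
  return s.file.substr (0, p) + ".hxx";
}
```

In this model, two mutually dependent schemas with two types each become four single-type schemas, and the cyclic dependency between the files turns into per-type includes that header_name() and any other per-schema code can process as usual.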

Configurable identifier naming conventions

One common objection to using automatic code generation is the difference between the identifier naming conventions used in a project and in the generated code. To address this concern, the XSD compiler allows you to specify a naming convention that should be used in the generated code for the C++/Tree mapping.

The two new options, --type-naming and --function-naming, allow you to select type and function naming conventions from a predefined set of widely-used styles. You can also provide regular expressions to customize or completely override one of the predefined styles.

Available type naming conventions are K&R (for example, test_type), upper-camel-case (for example, TestType), and Java (the same as upper-camel-case). Available function naming conventions are K&R (for example, test_function), lower-camel-case (for example, testFunction), and Java (for example, getTestFunction for accessors and setTestFunction for modifiers).
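
To make the difference between the styles concrete, here is a small sketch that derives the other forms from a K&R-style name. It is purely illustrative: the function names are made up, and it is not how the XSD compiler implements the conventions (which, as noted above, can be customized with regular expressions).

```cpp
#include <cassert>
#include <cctype>
#include <string>

// test_type -> TestType
//
inline std::string
upper_camel (const std::string& s)
{
  std::string r;
  bool up (true); // Capitalize the next letter.

  for (char c: s)
  {
    if (c == '_')
    {
      up = true;
      continue;
    }

    r += up
      ? static_cast<char> (std::toupper (static_cast<unsigned char> (c)))
      : c;

    up = false;
  }

  return r;
}

// test_function -> testFunction
//
inline std::string
lower_camel (const std::string& s)
{
  std::string r (upper_camel (s));

  if (!r.empty ())
    r[0] = static_cast<char> (
      std::tolower (static_cast<unsigned char> (r[0])));

  return r;
}

// test_function -> getTestFunction
//
inline std::string
java_accessor (const std::string& s)
{
  assert (!s.empty ());
  return "get" + upper_camel (s);
}

// test_function -> setTestFunction
//
inline std::string
java_modifier (const std::string& s)
{
  assert (!s.empty ());
  return "set" + upper_camel (s);
}
```

For example, upper_camel ("test_type") yields TestType, lower_camel ("test_function") yields testFunction, and java_accessor ("test_function") yields getTestFunction.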

For more information see the NAMING CONVENTION section in the XSD Compiler Command Line Manual (man pages).