GCC can now be built with a C++ compiler

You probably heard about the decision to allow the use of C++ in GCC itself. But it is one thing to say this and completely different to actually making a large code base like GCC to even compile with a C++ compiler instead of C. Well, GCC 4.7 got one step closer to this and can now be compiled with either a C or C++ compiler. Starting with 4.8, it is planned to build GCC in the C++ mode by default. Here is the C++ Build Status page for GCC 4.8 on various targets.

To enable the C++ mode in GCC 4.7, we use the --enable-build-with-cxx GCC configure option. As one would expect, different distributions made different decisions about how to build GCC 4.7. For example, Debian and Ubuntu use C++ while Arch Linux uses C. These differences are not visible to a typical GCC user which is why neither the GCC 4.7 release notes nor the distributions mention any of this. In fact, I didn’t know about the new C++ build mode until ODB, which is implemented as a GCC plugin, mysteriously failed to load with GCC 4.7. This “war story” is actually quite interesting so I am going to tell it below. At the end I will also discuss some implications of this change for GCC plugin development.

But first a quick recap on the GCC plugin architecture: GCC plugin is a shared object (.so) that is dynamically-loaded using the dlopen()/dlsym() API. As you may already know, with such dynamically-loaded shared objects, symbol exporting can work both ways: the executable can use symbols from the shared object and the shared object can use symbols from the executable, provided this executable was built with the -rdynamic option in order to export its symbols. This back-exporting (from executable to shared object) is quite common in GCC plugins since to do anything useful a plugin will most likely need to call some GCC functions.

Ok, so I built ODB with GCC 4.7 and tried to run it for the first time. The error I got looked like this:

cc1plus: error: cannot load plugin odb.so
odb.so: undefined symbol: instantiate_decl

Since the same code worked fine with GCC 4.5 and 4.6, my first thought was that in GCC 4.7 instantiate_decl() was removed, renamed, or made static. So I downloaded GCC source code and looked for instantiate_decl(). Nope, the function was there, the signature was unchanged, and it was still extern.

My next guess was that building GCC itself with the -rdynamic option was somehow botched in 4.7. So I grabbed Debian build logs (this is all happening on a Debian box with Debian-packaged GCC 4.7.0) and examined the configure output. Nope, -rdynamic was passed as before.

This was getting weirder and weirder. Running out of ideas, I decided to examine the list of symbols that are in fact exported by cc1plus (this is the actual C++ compiler; g++ is just a compiler driver). Note that these are not the normal symbols which we see when we run nm (and which can be stripped). These symbols come from the dynamic symbol table and we need to use the -D|--dynamic nm option to see them:

$ nm -D /usr/lib/gcc/x86_64-linux-gnu/4.7.0/cc1plus | 
grep instantiate_decl
0000000000529c50 T _Z16instantiate_declP9tree_nodeib

Wait a second. This looks a lot like a mangled C++ name. Sure enough:

nm -D -C /usr/lib/gcc/x86_64-linux-gnu/4.7.0/cc1plus | 
grep instantiate_decl
0000000000529c50 T instantiate_decl(tree_node*, int, bool)

I then ran nm without grep and saw that all the text symbols are mangled. Then it hit me: GCC is now built with a C++ compiler!

Seeing that the ODB plugin is written in C++, you may be wondering why did it still reference instantiate_decl() as a C function? Prior to 4.7, GCC headers that a plugin had to include weren’t C++-aware. As a result, I had to wrap them in the extern "C" block. Because GCC 4.7 can be built either in C or C++ mode, that extern "C" block is only necessary in the former case. Luckily, the config.h GCC plugin header defines the ENABLE_BUILD_WITH_CXX macro which we can use to decide how we should include the rest of the GCC headers:

#include <config.h>
extern "C"
} // extern "C"

There is also an interesting implication of this switch to the C++ mode for GCC plugin writers. In order to work with GCC 4.7, a plugin will have to be compiled with a C++ compiler even if it is written in C. Once the GCC developers actually start using C++ in the GCC source code, it won’t be possible to write a plugin in C anymore.

Comments are closed.