ODB - compiler-based ORM for C++

Wednesday, September 29th, 2010

If you have read my earlier posts on parsing C++ with GCC plugins (Part 1, Part 2, and Part 3), then you might remember that I mentioned a secret project that I have been working on. You might also have noticed that I’ve been neglecting this blog lately. Well, today is the day to unveil this secret project, which is what kept me busy for these past several months.

The project is called ODB and it is a compiler-based object-relational mapping (ORM) system for C++. It allows you to persist C++ objects to a relational database without having to deal with tables, columns, or SQL.

You might have already used other ORM implementations for C++. And if you have been exposed to ORM systems for other mainstream languages, such as Hibernate for Java, the C++ versions must have felt pretty inferior. The major sore point is the need to write some sort of serialization or registration code for each and every data member in each and every persistent class. Forgot to register a new member? Say goodbye to your data.

The primary goal of the ODB project is to change that. It takes a different approach and uses a C++ compiler to parse your classes and automatically generate the database conversion code. Or, more precisely, it uses the new GCC plugin architecture to re-use the tried and tested GCC compiler frontend to parse C++. As a result, ODB is capable of handling any C++ code. While the ODB compiler uses GCC internally, its output is standard C++ which means that you can use any C++ compiler to build the generated code and your application.

Let’s see how a persistent class declaration will look in ODB:

  #pragma db object
  class person
  {
    // Public constructors and accessors omitted.

  private:
    friend class odb::access;
    person ();

    #pragma db id auto
    unsigned long id_;

    string first_;
    string last_;
    unsigned short age_;
  };

ODB is not a framework. It does not dictate how you should write your application. Rather, it is designed to fit into your style and architecture by only handling C++ object persistence and not interfering with any other functionality. As you can see, existing classes can be made persistent with only a few modifications.

Given the above class, we can perform various database operations with its objects:

  person john ("John", "Doe", 31);
  person jane ("Jane", "Doe", 29);
  transaction t (db.begin ());
  db.persist (john);
  db.persist (jane);
  result r (db.query<person> (query::age < 30));
  copy (r.begin (),
        r.end (),
        ostream_iterator<person> (cout, "\n"));
  jane.age (jane.age () + 1);
  db.update (jane);
  t.commit ();

ODB is written in portable C++ and you should be able to use it with any modern C++ compiler. We have tested this release on GNU/Linux (x86/x86-64), Windows (x86/x86-64), Mac OS X, and Solaris (x86/x86-64/SPARC) with GNU g++ 4.2.x-4.5.x, MS Visual C++ 2008 and 2010, and Sun Studio 12. The dependency-free ODB compiler binaries are available for all of the above platforms. The initial release only supports MySQL as the underlying database. Support for other database systems is in the works.

Well, I hope this sounds as exciting to you as it does to me. And I hope you will enjoy playing around with ODB (check out the Hello World Example if nothing else) while I go catch up on some sleep.

Free proprietary license for XSD and XSD/e

Tuesday, August 3rd, 2010

Today we introduced a free proprietary license for CodeSynthesis XSD and XSD/e. The new license allows you to handle small XML vocabularies (less than 10,000 lines of generated code) in proprietary/closed-source applications free of charge and without any of the GPL restrictions such as having to publish your source code.

What were the reasons for offering such a license? After all, it seems like we will just lose money on this deal. We often get requests for our commercial proprietary license from developers who have a fairly small XML vocabulary, typically a configuration file or a small communication protocol for their application. While the XML documents are quite simple and it wouldn’t be very hard to parse them using DOM or SAX, the developers would still prefer to handle this task using our tools. After all, spending a few days writing mind-numbing code is still worse than generating the same code in a few seconds.

However, the administrative burdens and delays involved in such a purchase (getting approval from management, contacting the purchasing department, purchasing via PO or credit card, etc.) are often hard to justify considering such simple XML processing needs. The administrative overheads on our side (processing the PO or credit card, delivering the license, issuing the invoice, etc.) also force us to set a minimum limit on the license size and price that we can offer.

All this usually leads to either the license being too expensive for the task at hand or the understandable unwillingness of the developers to endure the purchasing process. As a result we have decided to spare the developers the agony of using inferior products and/or raw XML processing APIs and offer this license for free.

How much is 10,000 lines of code? While it depends on the optional XSD and XSD/e compiler features that you use (e.g., support for XML serialization, polymorphism, comparison and printing operators, as well as XML Schema validation in the case of XSD/e), as a rule of thumb, 10,000 lines of code are roughly equivalent to 40-50 local element/attribute definitions in the schema. This should be sufficient to handle small and even some medium-sized XML vocabularies. Also, if you have your schemas ready, you can quickly check how much generated code they require by downloading XSD or XSD/e and passing the --show-sloc option when compiling the schemas.

For more information on the new license as well as for answers to other common questions, see the following pages:

GNU make 3.82 released

Wednesday, July 28th, 2010

The next version of GNU make, 3.82, was released today. The build system that is used in all of Code Synthesis’ products relies heavily on GNU make so I have been contributing to this project for a couple of releases now. For this version I have implemented a few new features, fixed a number of bugs, and performed a number of optimizations. In this post I would like to discuss the most notable new additions (not necessarily made by me). For the complete list of user-visible changes refer to the NEWS file in the distribution.

Customizable recipe prefix

People who are new to make usually complain about the use of the tab character as a recipe prefix (the command part in the rule). Now it is possible to choose any symbol (one character) that you want with the .RECIPEPREFIX special variable, for example:

.RECIPEPREFIX := :
all:
:echo all

If you decide to change the prefix, then using a colon is actually not a bad choice since this character is already used by make as a rule separator and is therefore unlikely to appear in target or variable names.

Define improvements

Prior to 3.82 the define operator could only create recursively-expanding variables. It was also not possible to mark the variable as exported or overriding (export and override modifiers). In make 3.82 it is now possible to add a modifier as well as to specify the assignment type (simple, conditional, or appending). Here are a couple of examples, showing only the opening line of each definition (the body and the closing endef are omitted):

override define foo
define bar :=

Now there is also the undefine operator, which allows you to completely erase a variable so that it appears as if it was never set. Before, the closest you could get to this behavior was to set the variable to an empty value. However, some GNU make functions, such as $(origin) and $(flavor), will still show the difference between a variable that was never defined and one that contains an empty value.

Private variables

In GNU make it is possible to set a variable in a target-specific manner. Such a variable is only visible in the scope of this target, that is, in the rule recipes and when setting other target-specific variables. For example:

foo: x := bar
foo: y := $x
foo:
    @echo $x $y

One curious feature of target-specific variables in GNU make is the inheritance of such variables by the prerequisites of this target, provided that the making of the target triggered the making of the prerequisite. The following makefile fragment is the canonical motivating example for this feature:

debug: CFLAGS := -g
debug: driver.o
    $(CC) $(CFLAGS) -o $@ $^
release: driver.o
    $(CC) $(CFLAGS) -o $@ $^
driver.o: driver.c
    $(CC) $(CFLAGS) -c -o $@ $<

Here, if we run make as make debug, the debug target will trigger the update of driver.o and as a result the debug target’s CFLAGS value will be inherited by this prerequisite.

While this feature could be useful, such uncontrolled inheritance can also cause problems. There is also the view that building the same prerequisite differently depending on which target triggered the rebuild is a bad idea (consider what will happen in the above example if we had an up-to-date driver.o file that was created with the make release invocation).

In GNU make 3.82 it is now possible to mark a target-specific variable as private which means that it will not be inherited by the prerequisites:

debug: private CFLAGS := -g

It is also possible to mark a global variable private. In this case the variable will not be visible to any targets and their recipes.
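The visibility rules described in this section can be sketched as a small lookup function. This is only a toy model written for illustration, not GNU make's actual implementation: the names variable, scope, and lookup are hypothetical, and the model assumes that a private variable in a triggering target is simply skipped so that the search continues outward.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a make variable: its value plus the private flag.
struct variable
{
  std::string value;
  bool is_private;
};

using scope = std::map<std::string, variable>;

// Look up a variable for the target being built. The chain lists that
// target first, followed by the targets that triggered it (its parent,
// then the parent's parent, and so on). Returns an empty string if the
// variable is not visible.
std::string
lookup (const std::string& name,
        const std::vector<std::string>& chain,
        const std::map<std::string, scope>& target_vars,
        const scope& globals)
{
  for (std::size_t i = 0; i != chain.size (); ++i)
  {
    auto t = target_vars.find (chain[i]);
    if (t == target_vars.end ())
      continue;

    auto v = t->second.find (name);
    if (v == t->second.end ())
      continue;

    // A target always sees its own variables (i == 0). Variables of the
    // triggering targets are inherited only if they are not private.
    if (i == 0 || !v->second.is_private)
      return v->second.value;
  }

  // In this model a private global variable is hidden from every target.
  auto g = globals.find (name);
  if (g != globals.end () && !g->second.is_private)
    return g->second.value;

  return "";
}
```

With debug: CFLAGS := -g, a lookup of CFLAGS for the chain driver.o, debug yields -g in this model. Marking the variable private makes the same lookup fall through to the global value, while a lookup for debug itself still yields -g.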

New pattern ordering

Before this release, GNU make would match pattern rules (and pattern-specific variables) in the order they were defined, and the first rule that matched was used. Consider the following example:

%.o: src/%.c
    $(CC) $(CFLAGS) -c -o $@ $^
pic/%.o: src/%.c
    $(CC) -fPIC $(CFLAGS) -c -o $@ $^
all: libfoo.a
libfoo.a: foo.o pic/foo.o

Here we want to use a special rule to compile position-independent code and the normal rule otherwise. The problem is that the normal rule will also match the position-independent files. It is easy to fix this makefile by simply reordering the rules. However, this approach may not scale to more complex, multi-makefile build systems. To address this issue, GNU make now tries the rules in shortest-stem-first order, which results in the more specific rules being preferred over the more generic ones.
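To make the shortest-stem idea concrete, here is a minimal sketch of the selection step in C++. It is not GNU make's code: the function names match_stem and select_pattern are made up for this illustration, and the sketch deliberately ignores make's special handling of directory parts as well as the check that the prerequisites exist or can be made. It assumes that ties are resolved in favor of the rule defined first.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// If target matches pattern (which must contain one '%'), store the part
// matched by '%' in stem and return true. The stem must be non-empty.
bool
match_stem (const std::string& pattern,
            const std::string& target,
            std::string& stem)
{
  std::size_t p = pattern.find ('%');
  if (p == std::string::npos)
    return false;

  const std::string prefix = pattern.substr (0, p);
  const std::string suffix = pattern.substr (p + 1);

  if (target.size () <= prefix.size () + suffix.size ())
    return false;

  if (target.compare (0, prefix.size (), prefix) != 0)
    return false;

  if (target.compare (target.size () - suffix.size (),
                      suffix.size (),
                      suffix) != 0)
    return false;

  stem = target.substr (prefix.size (),
                        target.size () - prefix.size () - suffix.size ());
  return true;
}

// Return the matching pattern with the shortest stem. On a tie the
// pattern that comes first in the list wins. Returns an empty string
// if no pattern matches.
std::string
select_pattern (const std::vector<std::string>& patterns,
                const std::string& target)
{
  std::string best;
  std::size_t best_len = 0;

  for (const std::string& pat : patterns)
  {
    std::string stem;
    if (match_stem (pat, target, stem) &&
        (best.empty () || stem.size () < best_len))
    {
      best = pat;
      best_len = stem.size ();
    }
  }

  return best;
}
```

Given the patterns %.o and pic/%.o in that order, the target pic/foo.o matches both: the first with the stem pic/foo and the second with the stem foo. The sketch therefore selects pic/%.o regardless of the order in which the patterns are listed, while foo.o matches only %.o.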