[odb-users] Automatic generation of C++ classes from database
schema
Per Edin
info at peredin.com
Sun Feb 2 12:05:30 EST 2014
Hi,
Regarding the "rough draft" approach, this could be achieved by
including a region in every generated file that is preserved when
the schema is updated. A checksum could also be added to each file to
detect manual changes that an update would overwrite, and a --force
option could re-generate such modified source files anyway.
--rough-draft
Include a user-modifiable region in each file that will be preserved
on subsequent runs of the dumper. Exclude any public accessors unless
requested explicitly with --include-accessors.
--data-access
Generate simple data-access classes without any user-modifiable parts.
Public accessors are implicitly included. There is no
--exclude-accessors since excluding them would render data-access
classes completely useless.
A default constructor could be added to the user-modifiable region
when running the command for the first time.
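To make this concrete, here is a sketch of what a generated header
might look like under --rough-draft. The region markers, the checksum
comment, and the class itself are all invented for illustration; none
of it is existing ODB output:

```cpp
// person.hxx -- hypothetical --rough-draft output; the markers and the
// checksum comment below are invented for illustration.
#include <string>

// odb-dump checksum: 5f3a9c1e (used to detect edits outside the region)

class person
{
public:
  // -- begin user-modifiable region (preserved on re-generation) --
  person (): first_ ("John"), last_ ("Doe") {} // added on the first run
  std::string full_name () const {return first_ + " " + last_;}
  // -- end user-modifiable region --

private:
  std::string first_;
  std::string last_;
};
```

On a subsequent run the dumper would re-emit everything outside the
markers and splice the preserved region back in; --force would apply
to a file whose checksum shows out-of-region modifications.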
Per
On Fri, Jan 31, 2014 at 6:11 AM, Boris Kolpackov
<boris at codesynthesis.com> wrote:
>
> Hi All,
>
> There seems to be quite a bit of interest in being able to
> automatically generate C++ classes from the database schema.
> However, this is a fairly "hairy" feature in the sense that
> there are a lot of unclear/complex aspects that need to be
> better understood. This is especially so since we are trying
> to design a general tool.
>
> The goal of this thread is to try to flesh out an overall
> design for this feature based on experience and use-cases.
> So if you have some ideas or a need for this functionality,
> feel free to chime in.
>
> I've been thinking about this on and off for a couple of
> years now and here is an initial list of things that I
> believe we need to consider/discuss. Note also that not
> all of these features/ideas will be implemented in the
> first version (or even ever). However, it is a good
> idea to think through them to a certain level in order
> to understand how everything fits (or will fit) together.
>
> * What is the input to this tool? It can be an .sql file
> (dump from the database or manually created/maintained).
> Or it could be programmatically retrieved from a running
> database instance.
>
> The .sql approach feels cleanest to me but the complexity
> of parsing SQL is probably too much (don't believe me?
> check the Oracle SQL reference ;-)).
>
> The programmatic approach is probably the most practical
> even though it has a number of serious drawbacks (like
> the need to connect to a running database). Also, most
> likely it will be a separate tool that connects to the
> database and extracts the schema since we cannot link
> the ODB compiler to every database API library. So we
> need some kind of an intermediate format that the tool
> can produce and the ODB compiler can read. The XML
> format that we already have for the schema evolution
> sounds like a good candidate.
>
> Other things to consider in this area:
>
> - A way to limit the list of tables considered.
>
> - Do we use the ODB runtimes to access databases or
> should we just use the C APIs? Runtimes are
> not that convenient for manual database access
> though we could probably improve that. Also, for
> cases where we need to run plain SQL queries (as
> opposed to a special-purpose C API), we could even
> use ODB (views, etc).
>
> - We could make the ODB compiler call the extraction
> tool automatically and pipe the output to it.
>
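On the last point, schema extraction with plain SQL could itself go
through an ODB native view over the standard information_schema
catalog. A sketch (the exact query, and which databases expose
information_schema, will vary); the pragma below is meaningful only to
the ODB compiler, so there is nothing to run here as plain C++:

```cpp
// Sketch: an ODB native view over the information_schema catalog.
#include <string>

#pragma db view query("SELECT table_name FROM information_schema.tables WHERE table_schema = 'public'")
struct table_info
{
  std::string name; // bound to the first (and only) selected column
};

// db.query<table_info> () would then enumerate the tables to map.
```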
> * What is the output of the tool?
>
> - File per class? File per schema? Or something in between?
> For large schemas, the file-per-schema approach is not
> going to scale, especially when the database support
> code generated by ODB is concerned. The file per class
> approach can also get unwieldy very quickly for a large
> number of classes. We have the same problem in XSD
> (may end up with a couple of thousand source files).
> It is manageable but not pretty.
>
> The in-between solution is to somehow allow the user
> to specify how to group classes into files (e.g.,
> all related classes in a single file).
>
> * Intended uses: "rough draft" or "data access".
>
> What happens if/when the schema changes? Does the user
> re-generate the classes or update them manually?
>
> In other words, is this feature going to generate classes
> that are the "rough draft" and the user can fill them in
> with customizations (e.g., functions) or are they only for
> "data access" (i.e., don't have anything other than
> accessors and modifiers)?
>
> The problem with the "rough draft" approach is what
> happens when the schema changes: re-generating the
> classes will lose those customizations.
>
> The problem with the "data access" approach is that
> no functionality/logic can be added to the generated
> classes.
>
> We will probably have to support both use-cases.
>
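A "data access" class in this sense would be little more than a bag of
accessors and modifiers. A sketch with invented names (the
accessor/modifier style follows what ODB-generated code typically
uses):

```cpp
// Sketch of a generated "data access" class: accessors and modifiers
// only, no room for user logic. All names are illustrative.
#include <string>

class employee
{
public:
  const std::string& name () const {return name_;}
  void name (const std::string& n) {name_ = n;}

  unsigned int age () const {return age_;}
  void age (unsigned int a) {age_ = a;}

private:
  std::string name_;
  unsigned int age_ = 0;
};
```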
> * Support for customization?
>
> There are some options for supporting the customization of
> the generated classes though none of them are particularly
> elegant.
>
> We could also consider doing the unspeakable and extract
> user customizations from the C++ header files. The only
> reason why I am even bringing this option up is because we
> are C++-parsing this file anyway (during the database support
> code generation). The user will still have to mark the
> regions (e.g., with pragmas which ODB could pre-insert
> for each class) so it could be brittle (if you make your
> changes in the wrong place, they will be gone). Though
> there doesn't seem to be anything better.
>
> * Basic types mapping (string, containers, smart pointers)
>
> Different users will want different basic types to be used
> in their generated classes (std::string, QString, etc).
> In a sense, this is a reverse mapping of what ODB currently
> does: C++ type to database type. What we need is a database
> type to C++ type mapping. The big question is how and where
> it is specified.
>
> It would also be nice if this somehow tied in with profiles.
> That is, if a user specified -p qt, then ODB will use
> Qt types (QString, Qt smart pointers, Qt containers, etc)
> in the generated C++ classes automatically.
>
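At its simplest, the reverse mapping could be a per-profile lookup
table from database type names to C++ type names. A minimal sketch
(the entries are illustrative, nowhere near a complete mapping):

```cpp
// Sketch: a reverse (database type -> C++ type) map, switched by profile.
#include <map>
#include <string>

std::map<std::string, std::string>
reverse_type_map (bool qt_profile)
{
  if (qt_profile)                         // as if -p qt was specified
    return {{"VARCHAR", "QString"},
            {"TEXT",    "QString"},
            {"INTEGER", "int"}};

  return {{"VARCHAR", "std::string"},     // default (standard library)
          {"TEXT",    "std::string"},
          {"INTEGER", "int"}};
}
```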
> * Mapping for relationships, containers, (polymorphic)
> inheritance.
>
> This one is hard. ODB would somehow need to recognize
> certain patterns and map them to relationships, containers,
> etc. It may also need user guidance (see mapping
> customization/annotations).
>
> Generally, there are a lot more ways to structure
> these things (relationships, containers, inheritance)
> in relational databases than in C++ so for more esoteric
> cases there might not even be a sensible mapping. What
> would be nice is to come up with a general mechanism
> that would allow the user to specify the mapping for such
> cases. The big problem, of course, is that it can become
> so complex (see Hibernate and their relationship mapping)
> as to be completely unusable.
>
> An alternative could be to only support the straightforward
> cases and map the rest to plain objects for the user to
> deal with (i.e., one will be able to access the data but
> working with it won't be very convenient).
>
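For the straightforward end of the spectrum, the classic pattern is a
foreign key column becoming an object pointer. A sketch with invented
table/class names:

```cpp
// Sketch: book.author_id (a foreign key into author.id) becomes an
// object pointer in the generated class. Names are illustrative.
#include <memory>
#include <string>

struct author
{
  unsigned long id;
  std::string name;
};

struct book
{
  unsigned long id;
  std::string title;
  std::shared_ptr<author> author_; // was the author_id column
};
```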
> * Mapping customization/annotations.
>
> Where and how is it specified?
>
> Things that the user may want to specify:
>
> - which tables to map
> - how to map tables (container, poly-inheritance, etc)
> - column type mapping
>
> * Naming convention used in the generated classes.
>
> We have licked this problem nicely in XSD. The idea is
> to use a set of regex patterns to transform names to
> conform to a specific naming convention. XSD comes
> with a set of predefined patterns (K&R, Camel Case,
> and Java). The user can "adjust" one of these with
> a few regex'es of their own or can create a completely
> custom naming convention. We should most likely just
> use the same mechanism since it seems to work great.
>
> We should probably also make spacing/indentation adjustable,
> especially if the user is expected to add their code to
> the generated files (see customization).
>
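The mechanism itself is easy to picture: a pattern/replacement pair
applied to each schema name. XSD's actual option syntax may differ;
the sketch below just shows the idea of a regex-driven snake_case to
camelCase (Java-style) transform using std::regex:

```cpp
// Sketch: regex-driven renaming of snake_case schema names to a
// camelCase (Java-style) naming convention.
#include <cctype>
#include <regex>
#include <string>

std::string
to_camel (const std::string& s)
{
  std::regex r ("_([a-z])");
  std::string out;
  std::size_t last (0);

  for (std::sregex_iterator it (s.begin (), s.end (), r), end;
       it != end; ++it)
  {
    out += s.substr (last, it->position () - last);
    out += static_cast<char> (
      std::toupper (static_cast<unsigned char> ((*it)[1].str ()[0])));
    last = static_cast<std::size_t> (it->position () + it->length ());
  }

  out += s.substr (last);
  return out;
}
```

A predefined convention would then just be a bundle of such
pattern/replacement pairs that the user can override one-by-one.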