[odb-users] using special db field types

Boris Kolpackov boris at codesynthesis.com
Tue Jun 12 12:00:56 EDT 2012


Hi Giles,

Giles Forster <gforster at liquidcapital.com> writes:
 
> As you can see this is quite a compact structure compared to the default
> schema that would be created by ODB if I used a vector<double> for x and y
> in the source C++ class where it would create 3 tables, of which two would
> be indexed back to the first with a join to the series table.

I agree that using PostgreSQL array support results in a more concise
schema. Whether it results in better performance, I don't know; it
probably depends on the number of elements in the arrays as well as
the access patterns of the application.

Also, with the separate table approach, the more natural way to
arrange things would be to define a composite value instead of
having two parallel arrays (this is similar to your use of the
PostgreSQL point type):

#pragma db value
struct point
{
  double x;
  double y;
};

#pragma db object
class series
{
  ...
  std::vector<point> data;
};

This will also result in two tables instead of three.  
 
> Is there a way to make use of a db's SQL 'ARRAY' field type?

Not at the moment. I did some reading on this feature and in the
case of PostgreSQL, to support arrays we would need to parse their
undocumented binary format (ODB uses binary format for passing
parameters and receiving results, for efficiency). This format
appears to be quite complex (see this discussion[1], for example)
and I am not sure we want to go this route at this stage. We would
have to implement array support for every built-in type and then
there will still be a question about other/user-defined types,
like point.

However, there seems to be another way to add support for arrays.
It is possible to interface with PostgreSQL arrays using their
string representations (see, for example, array_to_string() and
string_to_array() functions). This approach can also be used
to handle other extended data types, such as point.

In fact, some time ago we already had a request for this functionality,
except in that case it was about the GEOMETRY extension. Here is my
earlier reply that outlined this idea in a bit more detail:

http://www.codesynthesis.com/pipermail/odb-users/2011-September/000312.html

The way I currently see this working is as follows: There would be
a pragma that would allow one to introduce additional database types
as well as information on how they can be mapped to one of the known
built-in types. Essentially, the ODB compiler will need to know
three things about the new type:

1. Its "interface type" (i.e., the built-in type that will be used
   to send/receive the data).

2. The "to" conversion expression that will be used in the SELECT
   statements.

3. The "from" conversion expression that will be used in the UPDATE
   and INSERT statements.

Also having the type name specified as a regex pattern and the above
three pieces of information as regex substitutions sounds like it will
give some nice flexibility to the mechanism. So, for the PostgreSQL
one-dimension array of double, the pragma could look like this:

#pragma db type("DOUBLE PRECISION\[[^[]\]")        \
           as("TEXT")                              \
           to("string_to_array(?, ',', 'NULL')")   \
           from("array_to_string(?, ',', 'NULL')")

After that, one would need to also provide a value_traits specialization,
say, for std::vector<double> that would be able to parse/serialize the
string representation of the array of doubles.

The good news is that, AFAICS, this should be fairly straightforward
to implement in the ODB compiler. So if you would like to use this
feature, let me know, and I will implement it and have something for
you to try.

[1] http://stackoverflow.com/questions/4016412/postgresqls-libpq-encoding-for-binary-transport-of-array-data

Boris



More information about the odb-users mailing list