[odb-users] Performance and another question

Tue Jun 2 11:39:44 EDT 2015

Hi Boris,
Thanks for the input. Boost::graph can use many representations (vectors,
lists, etc.) as the basis of the data structure it uses for the graph itself;
these are just template classes passed in that have to meet the
requirements. Let me think about this a bit.

Andrew

On Tue, Jun 2, 2015 at 2:57 AM, Boris Kolpackov <boris at codesynthesis.com>
wrote:

> Hi Andrew,
>
> Andrew Cunningham <andrew at a-cunningham.com> writes:
>
> > I am evaluating ODB as a replacement for an 'abandon-ware' commercial
> > OODB.
> >
> > I have a question regarding performance that other ODB users might be
> > able to help with.
> >
> > If we were looking for the best performance when working with lots of
> > data for a single user (single process/multiple threads) on a Windows
> > workstation, what would you suggest as the best SQL DB option (i.e.,
> > SQLite vs. MySQL vs. Postgres, etc.)?
> >
> > Obviously, I think that most people would suggest SQLite as you are
> > writing 'in process' with no external communication to a server process.
> > However, when working with our existing OODB vendor we got surprisingly
> > much better performance from using a 'local' server vs. 'in-process', as
> > the server process could perform disk operations without blocking the
> > client process. The server-client communication was over shared memory,
> > not network sockets, when both were on a local machine.
>
> This is actually an interesting question. Firstly, if you have a
> single process/single thread setup then SQLite will beat everything
> else hands down, by a very large margin.
>
> Now, when we move to the multi-threaded case, things get tricky.
> SQLite multi-threading support is really poor, especially the
> "write" side of it. So if you have a lot of threads and they are
> write-heavy (in other words, the worst kind of scenario), then
> SQLite starts to suffer, badly. For example, we have the 'threads'
> test in odb-tests which is this kind of write-heavy torture test.
> PostgreSQL (and recent MySQL) absolutely smoke SQLite if the
> machine has a decent number of cores (say 8 or more).
>
> So this is the spectrum. And the answer to your question depends
> on where on this spectrum your application falls. The best approach,
> of course, is to create a test that mimics your application's
> workload and see how each database performs. Luckily, with ODB,
> it will be very easy to write a simple test that you can run
> against every database.
>
>
> > One more question: What would people recommend for modeling a 'graph' as
> > a persistent data type in ODB? I use boost::graph currently. I am trying
> > to think of a way to do it in ODB that is not too awfully
> > slow/clumsy/messy. Just wondering if it has been done before.
>
> When mapping a C++ class to a database class in ODB, there are
> generally three ways to do it:
>
> 1. Map it to a corresponding database type (e.g., if your target
>    database supported the graph data type or something similar).
>
> 2. Map it to some kind of a "database data structure", i.e.,
>    one or more columns and/or tables that capture the data
>    in a suitable form.
>
>    In your case, for example, this could mean having column(s)
>    that contain containers (in PG) that represent the graph.
>
>    Or, perhaps, it makes sense to map it as a container. See,
>    for instance, how we mapped the Boost multi-index container
>    in the libodb-boost profile.
>
> 3. Map it to an opaque data structure, i.e., a BLOB. In this
>    case you simply store some binary representation of your
>    data structure.
>
> Which way is best depends on your requirements. (1) is of
> course the cleanest but I don't think there are built-in
> graph types supported by the databases we are talking about.
> I guess one way to approach this is to ask yourself how you
> would persist your graphs in a text file. Then try to model
> this format using (2). I am unfortunately not familiar enough
> with the way these two graph types store their internal
> representations so cannot really suggest which way is best.
>
> But if you can outline what would be a good "on disk"
> representation, I could suggest how to map it to the
> database.
>
> Boris
>
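
To make option (2) a bit more concrete, here is a minimal sketch of one
possible "on disk" shape for a graph: a vertex count plus a flat list of
(source, target) edges, which is roughly what one might write to a text file.
Nothing here comes from the thread or from ODB itself: the names edge_row,
stored_graph, adjacency, flatten and rebuild are all hypothetical, and a plain
vector-of-vectors stands in for the real boost::graph type. The comments note
where ODB pragmas could go, as an assumption about how such a mapping might
look rather than a tested ODB schema.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row type: one directed edge as a pair of vertex indices.
// With ODB this could plausibly be a composite value type
// (i.e., marked with "#pragma db value").
struct edge_row
{
  std::size_t source;
  std::size_t target;
};

// Hypothetical persistent form of a graph: vertex count plus edge list.
// With ODB this could plausibly be a persistent class ("#pragma db object",
// plus an id member), with the vector stored as a container table.
struct stored_graph
{
  std::size_t vertex_count = 0;
  std::vector<edge_row> edges;
};

// In-memory stand-in for the real graph type (an assumption made to keep
// the sketch free of Boost): adjacency[v] lists the targets of the edges
// leaving vertex v.
using adjacency = std::vector<std::vector<std::size_t>>;

// Flatten the in-memory graph into the storable edge-list form.
stored_graph flatten (const adjacency& g)
{
  stored_graph s;
  s.vertex_count = g.size ();

  for (std::size_t v = 0; v != g.size (); ++v)
    for (std::size_t t : g[v])
      s.edges.push_back (edge_row {v, t});

  return s;
}

// Rebuild the in-memory graph from the stored edge list.
adjacency rebuild (const stored_graph& s)
{
  adjacency g (s.vertex_count);

  for (const edge_row& e : s.edges)
  {
    // Every stored edge must refer to existing vertices.
    assert (e.source < s.vertex_count && e.target < s.vertex_count);
    g[e.source].push_back (e.target);
  }

  return g;
}
```

Calling flatten() before persisting and rebuild() after loading keeps the
graph library out of the persistent classes entirely. The trade-off is a full
copy on every load and store, so for option (3) the same edge list could
instead be packed into a single binary buffer and stored as a BLOB.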