[odb-users] Optimal way of checking if DB entry exists before persisting?

Fri Jun 24 07:19:06 EDT 2011

Hi Thomas,

Szumowski, Thomas <thomas.szumowski at lmco.com> writes:

> Suppose I have have an "employee" object that as a member has a pointer
> to an "employer" object as in the one-way relationship example from the
> user guide. I am periodically receiving "employee" objects from an
> external system and would like to persist the employees in one table
> and corresponding employers in another (as in the example).
> 
> If I blindly try to persist the employer (before attempting to persist
> employee), and an object with that employer ID already exists in the 
> database, ODB throws an error stating the object is already persisted.
> As an alternative, upon receipt of an employee object, I can do the
> following:
> 
> - open transaction
> - Get employer ID from received employee object
> - Query the database to see if this employer ID exists
>                 - If it does, then call db->update(..)
>                 - If it doesn't, then call db->persist(..)
> - commit
> 
> This procedure is similar to that outlined at the end of section 2.6 
> from the manual.
> 
> My question are:
>
> (1) Is this the most efficient way of updating/persisting objects that
>     may already exist in the DB?

There are several ways to optimize this:

1. Instead of using the query() function you can use find(). find() is
   quite a bit faster.

2. If you know that in most cases the object will not be in the database,
   then you can do a speculative persist() call. If it fails with the
   object_already_persistent exception, then you fall back to the 
   load()/update() call sequence. This method doesn't work as well for
   the situation where in most cases the object will be in the database
   since you need to call load() (which is equivalent to find()) before
   calling update().

3. If your application is the only entity that persists/updates employer
   objects, then you can maintain a map of such objects that outlive
   individual transactions. This way you can avoid doing find().

> (2) Since the objects come in periodically, is there a more efficient
>     way of persisting each employee to the database? I assume if the
>     data rate is high, doing commits so often may be inefficient and
>     it may be better to commit based off some timer periodically and
>     cache. (Is there a similar feature in ODB?)

This depends on your application. If you have multiple applications
accessing the database or a multi-threaded application, then having
long transactions will hurt concurrency. While it depends on the
database, I am also not certain that having a one long transaction
instead of a series of smaller ones will necessarily lead to better
performance.

In the above scenario, you could combine the transactions that persist
the employee and employer object into one transactions. Also, while ODB
doesn't support some sort of a timer-based auto-commit, you can easily
implement something like this yourself. For example, by counting the
number of objects that were updated/persisted and committing the current
transaction after a certain threshold has been reached.

Boris