[xsd-users] in memory footprint

Boris Kolpackov boris at codesynthesis.com
Fri Oct 10 08:49:27 EDT 2008


Hi Ray,

Rizzuto, Raymond <Raymond.Rizzuto at sig.com> writes:

> The app runs under Linux, 32 bit. It grows in virtual image size, as 
> reported by top, from an initial 100meg to ~3gig.   If I comment out
> the logic that creates/updates the SimpleOrder, there is no memory
> growth, so it seems very likely that the issue is the size of the 
> in memory object.  This works out to about 1.5meg per SimpleOrder,
> which seems excessive since it is under 3K serialized out as XML.

A 1.5MB object model for a 3KB XML file definitely doesn't sound
right. I've created a small test case with a memory profiler (based
on overloading operator new/delete). I used the schema from the
library example, which is quite heavy on structure (i.e., it has a
number of sequences and optional members as well as small chunks
of data), so it should be close to the worst case as far as
footprint is concerned.
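
For anyone who has not seen this kind of profiler before, here is a
minimal sketch of the general technique. It is not the code from the
test case: the names mem_stats, stats, live_bytes, and print_stats
are made up for illustration, and it assumes a C++11 or later
compiler. It counts requested bytes only, not allocator overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

// Hypothetical counters (illustrative names, not from the test case).
struct mem_stats
{
  std::size_t alloc_count, alloc_bytes, free_count, free_bytes;
};

static mem_stats stats = {0, 0, 0, 0};

// Each block is prefixed with a header that records the requested
// size, so that operator delete knows how many bytes are released.
union header
{
  std::size_t size;
  std::max_align_t align; // keeps the user pointer suitably aligned
};

// Replacement global operator new: count the request, then allocate.
void* operator new (std::size_t n)
{
  header* h = static_cast<header*> (std::malloc (sizeof (header) + n));
  if (h == 0)
    throw std::bad_alloc ();
  h->size = n;
  stats.alloc_count++;
  stats.alloc_bytes += n;
  return h + 1;
}

// Replacement global operator delete: read the size back and count it.
void operator delete (void* p) noexcept
{
  if (p == 0)
    return;
  header* h = static_cast<header*> (p) - 1;
  stats.free_count++;
  stats.free_bytes += h->size;
  std::free (h);
}

// Sized form (C++14 and later) forwards to the version above.
void operator delete (void* p, std::size_t) noexcept
{
  operator delete (p);
}

// Bytes currently held: everything allocated minus everything freed.
std::size_t live_bytes ()
{
  assert (stats.alloc_bytes >= stats.free_bytes);
  return stats.alloc_bytes - stats.free_bytes;
}

// Print the counters in the same layout as the results in this email.
void print_stats ()
{
  std::printf ("allocate:\n  count: %zu\n  bytes: %zu\n\n",
               stats.alloc_count, stats.alloc_bytes);
  std::printf ("free:\n  count: %zu\n  bytes: %zu\n",
               stats.free_count, stats.free_bytes);
}
```

To measure one operation, take a copy of stats before it, run the
operation, and subtract the old values from the new ones. The
default operator new[] and delete[] forward to the functions above,
so array allocations are counted as well.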

The test first creates an object model that, when serialized
(without pretty-printing), results in a 3KB XML file. The test
measures various memory parameters during this creation. Here
is the result on a 32-bit GNU/Linux box using g++ 4.1.3:

allocate:
  count: 182
  bytes: 5504

free:
  count: 4
  bytes: 60


So the object model takes 5444 bytes (5504 allocated minus 60
freed). I did some investigation into where the memory goes and
most of the overhead comes from string/sequence/containment
management data. For example, std::string has 13 bytes of
overhead. When it holds the "Test author" string, the overhead is
greater than the payload. If there were more actual data (e.g.,
longer strings, a greater number of elements in sequences, etc.),
then the overhead would become less of an issue.

The second part of the test clones this object model a number of
times. Here is the output for 1000 clones:

allocate:
  count: 130000
  bytes: 4260000

free:
  count: 0
  bytes: 0

The size per copy is 4260 bytes (4260000 / 1000). It might be
surprising that this is less than the 5444 bytes taken by the
initial object. This is due to the copy-on-write (COW) semantics
of std::string in this version of libstdc++: all copies share the
string buffers of the initial object.

Based on this test I tend to think that there is either something
special about your SimpleOrder schema or there is something else
going on. Perhaps you could profile the creation of a sample
SimpleOrder object as I did for the library schema. The test
case is available here:

http://www.codesynthesis.com/~boris/tmp/memprof.tar.gz

I would be interested to know the numbers that you get.

Boris



