[xsd-users] Keep a xsd library small
Boris Kolpackov
boris at codesynthesis.com
Fri Feb 13 08:56:49 EST 2009
Hi Angelo,
Angelo Difino <angelo at cedeo.net> writes:
> first of all I'd like to say that xsd-codesynthesis rocks: after just few
> days of it i was able to create the c++ classes of a quite 'complicated'
> set of schemas (that was already posted here a few months ago).
Thanks, I am glad you find it useful.
> Since this set of schema suffer of cyclic dependencies with inheritance,
> I'm using the file-per-type option. I'm running the last release (3.2.0)
> of XSD on win32/vista...
>
> Everything works great, but the problem is the huge amount of files
> and its size when I try to compile it (i'm using MSVisual c++ 2003).
> The header and class file counts 872 and the built library is of 1GB
> of size
You are probably building a static library. Unfortunately, in
the file-per-type mode static libraries for non-trivial schemas
are bound to be quite large. This is due to the large number of
source files which, when compiled, all include instantiations of
some common templates (this is especially true when the
--generate-polymorphic option is used). A static library is pretty
much an archive of all the object files, duplicates included. When
it is linked into an executable, the linker removes all those
duplicate template instantiations, so the resulting binary will
have the same size as if you had used the file-per-schema mode.
One solution to this problem is to use a shared library (DLL)
instead of a static library, since a shared library is more like
an executable in that all the duplicate template instantiations
are removed. For example, I compiled your schemas on my GNU/Linux
box and while the static library is 227MB, the shared library is
10MB.
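As a rough illustration of the static vs. shared build on GNU/Linux, something along these lines should work. This is only a sketch: the g++/ar toolchain, the library name libschemas, and the DRY_RUN switch are assumptions and not taken from the build described above. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: build the generated sources as a shared library instead of a
# static one. Assumptions (hypothetical, for illustration only): g++ and ar,
# the name libschemas, generated sources ending in .cxx, and DRY_RUN.

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
# Printing joins the arguments with spaces, so shell quoting is not shown.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Compile every generated source file as position-independent code.
compile_all() {
  for src in *.cxx; do
    [ -e "$src" ] || continue   # skip if there are no generated sources
    run g++ -fPIC -c "$src" -o "${src%.cxx}.o"
  done
}

# Static library: an archive of all the object files, duplicates included.
link_static() { run ar rcs libschemas.a "$@"; }

# Shared library: linked like an executable, so duplicates are removed.
link_shared() { run g++ -shared -o libschemas.so "$@"; }

compile_all
link_shared *.o
```

Comparing the size of libschemas.a and libschemas.so afterwards should show the kind of difference described above, though the exact numbers depend on the schemas and the compiler options.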
Here are some more tips for reducing the size/compilation time:
1. Specify the root element with the --root-element option. In your
schema I eliminated about a hundred parsing/serialization
functions by adding '--root-element DIDL' to the command
line.
2. Your schema is composed of several lower-level schema subsets.
If only one of the lower-level subsets involves the cyclic
dependency, then you can compile only this subset in the
file-per-type mode and the rest in the default, file-per-
schema mode. Note that here the subset needs to be fairly
isolated in that all the schemas that it includes/imports
will be handled in the file-per-type mode. In your case,
there are two subsets that involve cyclic dependencies
(rel-*.xsd and ipmpmsg.xsd/ipmpinfo.xsd) so the bulk of
the schema has to be compiled in the file-per-type mode.
However, there are still a few files that can be compiled
in the file-per-schema mode, namely, didl.xsd, didl-msx.xsd,
and mpeg4smp.xsd.
I compiled your schemas like so:
    xsd cxx-tree --file-per-type ... rel-r.xsd ipmpinfo.xsd
    xsd cxx-tree ... didl.xsd didl-msx.xsd mpeg4smp.xsd
And the static library size went down to 175MB.
3. You can also try to split the offending schemas into two or
more files so that they don't involve cyclic dependencies
with inheritance (the resulting schema will be semantically
equivalent to the original). Then you can use the file-per-
schema mode.
4. When using the file-per-type mode it is recommended to use
precompiled headers to speed up compilation. You would normally
include xml-schema.hxx into the precompiled header and then
include the precompiled header into each generated source
file using the --cxx-prologue option.
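Tips 1, 2 and 4 could be combined on the command line roughly as follows. Again, this is a sketch under assumptions: the header name pch.hxx (a hypothetical precompiled header that includes xml-schema.hxx) and the DRY_RUN switch are made up for illustration, passing --root-element to both invocations is my assumption, and the options elided with "..." in the commands above are simply left out. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: split compilation with --root-element and --cxx-prologue.
# Assumptions (hypothetical): the header name pch.hxx, which would contain
#   #include "xml-schema.hxx"
# and the DRY_RUN switch. Other command line options are omitted.

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
# Printing joins the arguments with spaces, so shell quoting is not shown.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Subsets with cyclic dependencies: file-per-type mode, with the
# precompiled header included at the top of each generated source file.
gen_per_type() {
  run xsd cxx-tree --file-per-type --root-element DIDL \
    --cxx-prologue '#include "pch.hxx"' \
    rel-r.xsd ipmpinfo.xsd
}

# Remaining schemas: default file-per-schema mode.
gen_per_schema() {
  run xsd cxx-tree --root-element DIDL \
    didl.xsd didl-msx.xsd mpeg4smp.xsd
}

gen_per_type
gen_per_schema
```

The precompiled header itself still has to be set up in the compiler or project settings; the prologue only makes sure each generated source file includes it first.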
Boris