[xsd-users] Re: Question about XML versionning
Boris Kolpackov
boris at codesynthesis.com
Fri Oct 14 09:04:08 EDT 2011
Hi Bruno,
[CC'ing the xsd-users mailing list to my reply.]
Ledoux, Bruno <Bruno.Ledoux at spotimage.fr> writes:
> Many of the "external" XML vocabularies (GML, HMA) we are handling using
> libxsd are versioned. Each time a new version is issued a new set of XSD
> files is produced and need to be integrated in our parsing mechanism.
> These new XSD files are not backward compatible.
>
> We would like to set up a factory mechanism allowing us to build the
> appropriate parser based on the version number of the XML file that
> needs to be parsed. This would allow us to build a unique parsing
> program handling the different versions.
>
> What would be the best approach to address this issue with respect
> to performance and memory usage ?
While having a separate program (e.g., an executable) that handles a
particular version of the XML vocabulary will work, it is probably
not the most efficient approach, unless you know which version each
XML file corresponds to without looking inside the file.
If you have to look inside the XML file, which would be needed, for
example, if the version is specified as an attribute in the root
element, then the best approach in this setup would be to have a
special "starter" program which parses just enough of the XML file
to determine its version and then starts the corresponding "parsing"
program. The easiest way to parse just the prefix of an XML file
would be to use a SAX (SAX2XMLReader) parser and stop (e.g., by
throwing an exception from the handler or simply by doing exec())
once you've got the version.
A more efficient approach would be to handle all the versions in a
single executable. This way, you can first parse the
XML document to DOM, determine its version, and then pass this same
DOM tree to the corresponding XSD-generated parsing function to
be parsed to the object model. In fact, this approach is very
similar to how one can handle XML documents with varying root
elements. The 'multiroot' example in the examples/cxx/tree/
directory in the XSD distribution shows how to do this.
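
The single-executable idea can be sketched as a small dispatch table keyed by version. This is only an outline under stated assumptions: the `document` struct stands in for the DOM tree (in a real build it would be a Xerces-C++ DOMDocument), and the `v2`/`v3` namespaces, the `load` functions, and `version_dispatcher` are hypothetical names standing in for wrappers around the XSD-generated parsing functions.

```cpp
#include <map>
#include <string>

// Stand-in for the parsed DOM tree; a plain struct keeps the sketch
// self-contained. Assumption: the version has already been read from the root.
struct document {
    std::string version;  // value of the root element's version attribute
    std::string payload;  // placeholder for the rest of the tree
};

// Hypothetical per-version entry points. In a real build each would call the
// XSD-generated parsing function for that schema version on the same DOM tree.
namespace v2 { std::string load(const document& d) { return "v2:" + d.payload; } }
namespace v3 { std::string load(const document& d) { return "v3:" + d.payload; } }

class version_dispatcher {
public:
    typedef std::string (*loader)(const document&);

    // Register the loader responsible for one vocabulary version.
    void add(const std::string& version, loader fn) { loaders_[version] = fn; }

    // Run the loader matching the document's version.
    // Returns false if no loader is registered for that version.
    bool load(const document& d, std::string& result) const {
        std::map<std::string, loader>::const_iterator it =
            loaders_.find(d.version);
        if (it == loaders_.end()) return false;
        result = it->second(d);
        return true;
    }

private:
    std::map<std::string, loader> loaders_;
};
```

Typical use would be to register one loader per supported version at startup (for example, add("3.1.1", &v3::load)) and call load() once the DOM tree is available. Because the document is parsed to DOM only once, the version check adds no second pass over the file.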
Of course, if the files that you are parsing are quite large, then
the overhead of parsing a prefix in the starter program will be
fairly minor and the multi-program approach may prove more
suitable since you can encapsulate handling of each version into
a standalone executable. This will especially be true if different
schema versions still use the same XML namespace. In this situation,
in order to be able to handle multiple versions in the same program,
you would need to map the XML namespaces of different versions to
different C++ namespaces (see --namespace-map and --namespace-regex
XSD options).
Boris