[xsd-users] embedding and fgXercesLoadExternalDTD?

Mario Lang mlang at delysid.org
Sun Apr 19 12:06:42 EDT 2015


Hi.

I realize this is more a Xerces-C++ question, but since I stumbled across
this issue while using the embedding example, I hope someone here might
have an answer or at least a hint.

I am using xsdcxx to generate MusicXML bindings[1].  I recently made use of
the cxx/tree/embedding example to avoid having to distribute the XSD
schema in a separate file.  This is a very nice approach which will
make deployment simpler.  However, MusicXML documents are somewhat
strange in that they almost always include a DOCTYPE declaration in the
XML.

So currently, if I enable fgDOMValidate (which I actually *want*),
Xerces-C always fetches the DTD from the net, because the canonical
MusicXML DocType looks like this:

<!DOCTYPE score-partwise PUBLIC "-//Recordare//DTD MusicXML 3.0 Partwise//EN"
          "http://www.musicxml.org/dtds/partwise.dtd">

I then discovered fgXercesLoadExternalDTD, which looks like what I want.
However, with fgDOMValidate set to true, the DTD is still loaded even
when fgXercesLoadExternalDTD is set to false.
Setting fgDOMValidate to false does skip loading the DTD, but then the
parser apparently no longer reports an error for unknown element names
in the input document.  This surprises me, because fgXercesSchema is
set to true and the grammar pool is also set up, so I was expecting
schema validation to kick in instead.

I am stuck here.  I am looking for a way to have the Schema grammar
validate the input document while still *skipping* the loading of the
DTD from the net, either by just ignoring it or by somehow caching it in
the GrammarPool.  However, I do not see whether and how the GrammarPool
can also embed DTDs.  As an added complication, the MusicXML DTD is
versioned and comes in two variants (partwise and timewise), so the
public ID could be "-//Recordare//DTD MusicXML 3.0 Partwise//EN", but it
could also be "-//Recordare//DTD MusicXML 2.0 Partwise//EN", for
instance.
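
To make the naming scheme concrete, here is a small sketch, independent
of Xerces-C, that pulls the version and variant out of such a public ID.
The names MusicXmlDtdId and parse_public_id are made up for this
illustration, and the code assumes every ID follows the
"-//Recordare//DTD MusicXML <version> <Variant>//EN" pattern shown
above; I have not checked that against every MusicXML release.  It needs
C++17 for std::optional.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper type: the two parts that vary between MusicXML DTDs.
struct MusicXmlDtdId {
  std::string version;  // e.g. "3.0"
  std::string variant;  // e.g. "Partwise" or "Timewise"
};

// Assumes the pattern "-//Recordare//DTD MusicXML <version> <Variant>//EN".
// Returns std::nullopt for any public ID that does not match it.
std::optional<MusicXmlDtdId> parse_public_id(const std::string& public_id) {
  const std::string prefix = "-//Recordare//DTD MusicXML ";
  const std::string suffix = "//EN";

  // The ID must be long enough to hold prefix, suffix and something between.
  if (public_id.size() <= prefix.size() + suffix.size()) return std::nullopt;
  if (public_id.compare(0, prefix.size(), prefix) != 0) return std::nullopt;
  if (public_id.compare(public_id.size() - suffix.size(), suffix.size(),
                        suffix) != 0) {
    return std::nullopt;
  }

  // What remains should be "<version> <Variant>", split at the first space.
  const std::string middle = public_id.substr(
      prefix.size(), public_id.size() - prefix.size() - suffix.size());
  const std::size_t space = middle.find(' ');
  if (space == std::string::npos || space == 0 || space + 1 == middle.size()) {
    return std::nullopt;
  }
  return MusicXmlDtdId{middle.substr(0, space), middle.substr(space + 1)};
}
```

A resolver could use the result to decide which embedded DTD (if any) to
hand to the parser, and fall back to default behaviour when the public
ID is not a MusicXML one.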

1. Can the embedding example be extended to embed several DTD variants?
2. If so, how does the public ID resolving work?  AFAICS, the actual DTD
file(s) do not include the public identifier, so I wonder how the grammar pool
is supposed to know the public ID of a particular DTD it caches.
3. Is there a way to have fgDOMValidate enabled while setting
fgXercesLoadExternalDTD to false and do what I mean?
4. Or do I have to create a separate grammar pool for each DTD variant
and use the DOMLSResourceResolver interface somehow to point the parser
at the correct pool?  I guess not, because the grammar pool is specified
when instantiating the lsparser, not during resource resolution.

Any other options I am missing?  I once played with a
DOMLSResourceResolver that makes use of the libxml2 catalog support, but
that does not look particularly portable.  I guess Xerces-C still has no
native catalog support, does it?
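
In the absence of catalog support, what I have in mind is roughly a
lookup table from public ID to DTD text compiled into the binary.  The
sketch below is only that table; EmbeddedDtdCatalog is a made-up name,
the DTD strings in the usage example are placeholders rather than real
DTD contents, and wiring it into a DOMLSResourceResolver is not shown.
It needs C++17 for std::string_view.

```cpp
#include <map>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical registry: public ID -> DTD text embedded in the executable.
// The string_view values must refer to storage that outlives the catalog
// (for example static arrays generated at build time).
class EmbeddedDtdCatalog {
 public:
  // Registers (or replaces) the DTD text for one public ID.
  void add(std::string public_id, std::string_view dtd_text) {
    entries_[std::move(public_id)] = dtd_text;
  }

  // Returns a pointer to the embedded text, or nullptr for unknown IDs so
  // the caller can fall back to whatever the parser would do by default.
  const std::string_view* find(const std::string& public_id) const {
    const auto it = entries_.find(public_id);
    return it == entries_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, std::string_view> entries_;
};
```

Usage would look something like this, with one entry per version and
variant:

```cpp
EmbeddedDtdCatalog catalog;
catalog.add("-//Recordare//DTD MusicXML 3.0 Partwise//EN",
            "<!-- placeholder for embedded partwise.dtd 3.0 -->");
const std::string_view* dtd =
    catalog.find("-//Recordare//DTD MusicXML 3.0 Partwise//EN");
```

This sidesteps the question of how a DTD file knows its own public ID:
the mapping lives in the table, not in the DTD.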

[1] http://github.com/mlang/xsdcxx-musicxml

-- 
CYa,
  ⡍⠁⠗⠊⠕


