[xsd-users] Ignoring unknown elements

Fri Oct 17 08:47:00 EDT 2014

Hi Vladimir,

Vladimir Zykov <vladimir.zykov at ncloudtech.ru> writes:

> Why we think changing schema will reduce amount of code is because we support
> a small portion of the original schema. This might change someday but for now
> we better remove non-supported types/elements.

So you are essentially cleaning up the schema of all the elements that
you don't use even though the actual XML might contain them. One thing
that you may consider doing is generating a list of "known" elements
and then filtering all the unknown ones out of the DOM. One way to do
this would be to delete them from the DOM document after it has been
loaded (and before handing it off to XSD-generated code). But if you
are after efficiency, then a better way would be to filter them out
while parsing. I vaguely remember a mention of filters in the DOM API
but I have never used them myself. The other option is to do something
similar to what
the 'streaming' example does. With this mechanism you could easily
ignore whole XML subtrees (based on your known element list) without
them ever ending up in DOM. I can't think of a more efficient way
than that.
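
To make the first option (pruning after loading) concrete, here is a
minimal sketch. It does not use the real DOM API: Element, add_child()
and prune_unknown() are hypothetical stand-ins that I made up for
illustration, and real code would do the same walk over the DOM nodes
of the loaded document before passing it to the generated parsing
functions. The root element is assumed to be known.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM element (not a Xerces-C++ type).
struct Element
{
  std::string name;
  std::vector<std::unique_ptr<Element>> children;
};

// Helper for building a small tree; returns the newly added child.
Element&
add_child (Element& parent, const std::string& name)
{
  std::unique_ptr<Element> c (new Element);
  c->name = name;
  parent.children.push_back (std::move (c));
  return *parent.children.back ();
}

// Remove every child subtree whose element name is not in the known
// list, then recurse into the children that are left. Returns the
// number of subtrees removed (nested unknown elements that disappear
// together with their parent are not counted separately).
std::size_t
prune_unknown (Element& e, const std::set<std::string>& known)
{
  auto& c (e.children);

  auto i (std::remove_if (
            c.begin (), c.end (),
            [&known] (const std::unique_ptr<Element>& p)
            {
              return known.count (p->name) == 0;
            }));

  std::size_t removed (static_cast<std::size_t> (c.end () - i));
  c.erase (i, c.end ());

  for (auto& p: c)
    removed += prune_unknown (*p, known);

  return removed;
}
```

Because an unknown element is dropped together with everything below
it, the walk never descends into subtrees that are going to be
discarded anyway. The streaming variant applies the same test but
earlier, before the subtree is built at all.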

The only potential problem with the known element list approach is
that the same element name could be "known" in one context (e.g.,
XML Schema type) and "unknown" in another. I would resolve this
by saying that if an element name is known in one place, then
it must also be known in all other places in the schema.
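
In terms of the known element list, this rule amounts to taking the
union of the element names supported in each type. A small sketch,
assuming you can produce a per-type list from your trimmed schema (the
map layout and the global_known() name are made up for illustration):

```cpp
#include <map>
#include <set>
#include <string>

using ElementNames = std::set<std::string>;

// Given the element names supported in each schema type (keyed by
// type name), build a single context-free known list: a name that is
// known in any one type is treated as known everywhere.
ElementNames
global_known (const std::map<std::string, ElementNames>& per_type)
{
  ElementNames r;

  for (const auto& t: per_type)
    r.insert (t.second.begin (), t.second.end ());

  return r;
}
```

The price of the simpler lookup is that an element which is only
supported in one type will also survive filtering wherever else it
appears, so the trimmed schema has to allow it in those places as well.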

> That's fine. If you think that very few users will need it we can do this
> ourselves. The only thing that is unclear for me now is how to publish our
> changes as demanded by GNU license (XSD is only project with GNU license 
> that we use).

Seeing that you have a proprietary license for XSD, you don't need
to worry about the GPL requirements. If you don't want to publish
your changes, you don't have to.

Boris