[xsd-users] Re: Serialisation issue.

Boris Kolpackov boris at codesynthesis.com
Mon Jan 1 09:51:29 EST 2007


Hi David,


Moss, David R (SELEX Comms) (UK Christchurch) <david.r.moss at selex-comm.com> writes:

> I'm also using 2.7.0.
>
> With the attached files, test-users-2.xml doesn't load:
>
> // Loads.
> auto_ptr<class UserDatabase_t> db( UserDatabase( "test-users.xml" ) );
>
> // Exception thrown.
> auto_ptr<class UserDatabase_t> db2( UserDatabase( "test-users-2.xml" ) );

I can now reproduce the problem too. I did some debugging and here is
what I've come up with. XML Schema import directives and schemaLocation
attributes give the schema processor a "hint" as to which schema files
"cover" which namespace. As a result most (all?) schema processors use
namespace URIs as keys when they try to determine whether a grammar for
a particular namespace has already been loaded.

Here is what happens in your case when noNamespaceSchemaLocation comes
first (as in test-users-2.xml). The parser loads derived-user-config.xsd
which imports test-user-config.xsd for namespace http://www.dave.com/Base.
Then the parser moves on to the schemaLocation attribute which specifies
test-users.xsd for namespace http://www.dave.com/Base. Before loading
test-users.xsd, the parser checks whether there is already a grammar for
the http://www.dave.com/Base namespace. Since we've already "covered"
http://www.dave.com/Base with test-user-config.xsd, the parser skips
loading test-users.xsd. And everything goes downhill from there.
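
To make the sequence above concrete, here is a toy model of a grammar
cache keyed by namespace URI. It is only a sketch of the lookup logic as
described, not Xerces-C++ code: the grammar_cache class and the
replay_test_users_2() function are made up for illustration, and I am
assuming that derived-user-config.xsd has no target namespace (since it
is referenced via noNamespaceSchemaLocation).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy model of a grammar cache keyed by namespace URI. This is not
// Xerces-C++ code; it only imitates the lookup described above.
class grammar_cache
{
public:
  // Try to load a schema file for a namespace. Returns false if the
  // namespace is already covered, in which case the file is skipped.
  bool
  load (const std::string& ns, const std::string& file)
  {
    assert (!file.empty ());
    return map_.insert (std::make_pair (ns, file)).second;
  }

  // Return the file that covers the namespace or an empty string.
  std::string
  covered_by (const std::string& ns) const
  {
    std::map<std::string, std::string>::const_iterator i (map_.find (ns));
    return i != map_.end () ? i->second : std::string ();
  }

private:
  std::map<std::string, std::string> map_;
};

// Replay the load order for test-users-2.xml. Returns true if
// test-users.xsd was actually loaded.
bool
replay_test_users_2 (grammar_cache& c)
{
  const std::string base ("http://www.dave.com/Base");

  // noNamespaceSchemaLocation comes first (assumed: no target namespace).
  c.load ("", "derived-user-config.xsd");

  // Its import covers the Base namespace with the partial schema.
  c.load (base, "test-user-config.xsd");

  // schemaLocation is processed next but the namespace is already taken.
  return c.load (base, "test-users.xsd");
}
```

In this model replay_test_users_2() returns false and the Base namespace
stays covered by test-user-config.xsd, which mirrors the skipped load of
test-users.xsd.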

I tend to think this is a general XML Schema design flaw (or limitation)
rather than a bug in Xerces-C++. The upshot is that you always need to
specify a complete, top-level schema file for a namespace in import
declarations and schemaLocation attributes. In your case that would
mean importing test-users.xsd instead of test-user-config.xsd in
derived-user-config.xsd. There are also a couple of alternative
solutions:

1. Use different namespaces for test-users.xsd and test-user-config.xsd.

2. Instead of specifying schemas in instance documents, you can
   programmatically load them into the Xerces-C++ grammar cache. This
   way you can make sure that the schemas are loaded in the proper order
   (i.e., first test-users.xsd, then derived-user-config.xsd). For more
   information on how to do this see the following resources:

   http://www-128.ibm.com/developerworks/webservices/library/x-xsdxerc.html
   http://wiki.codesynthesis.com/Tree/FAQ


> (As an aside, the first shouldn't load either as Number should be
> unique. I'm assuming this is due to xpath="TestUserConfig" not allowing
> for substitution groups.)

Since XML Schema identity constraints are based on XPath and XPath operates
in terms of elements and attributes (instead of types), identity constraints
do not work very well with substitution groups, where the element name
changes (they work OK with xsi:type, since the element name stays the same).

The only way to make it work with substitution groups is to list all
possible substitutions in the selector:

xpath="b:TestUserConfig|DerivedUserConfig"

(Note that you need to write b:TestUserConfig rather than TestUserConfig
in the xpath attribute since TestUserConfig is a qualified element.)

This is not very flexible since you have to hard-code all possible
substitutions. I think the best solution in this case is to scrap XML
Schema identity constraints altogether and implement the check in the
code:

// Requires #include <set>.
//
const UserDatabase_t& db = ...
std::set<xml_schema::string> ids; // Numbers seen so far.

for (UserDatabase_t::TestUserConfig::const_iterator
     i (db.TestUserConfig ().begin ());
     i != db.TestUserConfig ().end ();
     ++i)
{
  if (ids.find (i->Number ()) != ids.end ())
  {
    // Identity constraint violation: this Number was already seen.
  }

  ids.insert (i->Number ());
}


hth,
-boris