[xsd-users] Issue using proxy for Open Packaging Conventions core-properties schema

Rob Ursem Rob.Ursem at cmgl.ca
Tue Jun 25 15:48:19 EDT 2013


Boris,

I've been able to customize the SimpleLiteral type successfully and get the text from it,
thanks to the clear write-up and examples!

It didn't help me very much, though, since I got stuck again on the next element.
The opc-coreProperties schema uses the following definition for CoreProperties:

<xs:complexType name="CT_CoreProperties">
    <xs:all>
      <xs:element ref="dcterms:created" minOccurs="0" maxOccurs="1" />
      <xs:element ref="dc:creator" minOccurs="0" maxOccurs="1" />
      <xs:element ref="dc:description" minOccurs="0" maxOccurs="1" />
      <xs:element ref="dc:identifier" minOccurs="0" maxOccurs="1" />
      <xs:element name="keywords" minOccurs="0" maxOccurs="1" type="xs:string" />
      <xs:element ref="dc:title" minOccurs="0" maxOccurs="1" />
      <xs:element name="version" minOccurs="0" maxOccurs="1" type="xs:string" />
    </xs:all>
</xs:complexType>

In dc.xsd the following is defined:
<xs:schema>

  ...
  
  <xs:element name="any" type="SimpleLiteral" abstract="true"/>
  
  <xs:element name="title" substitutionGroup="any"/>
  <xs:element name="creator" substitutionGroup="any"/>
  <xs:element name="description" substitutionGroup="any"/>
  <xs:element name="identifier" substitutionGroup="any"/>

  ...

</xs:schema>

XSD maps title, creator, description and identifier onto xml_schema::type, for example:
    // identifier
    // 
    typedef ::xml_schema::type identifier_type;

Thus far I have not been able to customize xml_schema::type.

I thought that, since the "any" element is declared with type SimpleLiteral,
any element in the "any" substitution group would also be of the SimpleLiteral type.
That would make my customization work like a charm. Instead, it seems that XSD
is interpreting substitutionGroup="any" as the xs:anyType type.
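
The expected behaviour can be sketched as follows. In XML Schema, an element declared without a type but with a substitutionGroup takes the type of the group's head element; with neither, it defaults to xs:anyType. The minimal C++ sketch below models only that lookup. ElementDecl, effective_type and dc_decls are hypothetical names used for illustration, and the sketch says nothing about how the XSD compiler itself is implemented.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical, simplified model of a global element declaration.
struct ElementDecl {
    std::optional<std::string> type;               // explicit type="...", if any
    std::optional<std::string> substitution_group; // head element name, if any
};

// Resolve the effective type of an element: an explicit type wins; otherwise
// the type is inherited from the substitution group head; with neither, the
// element falls back to xs:anyType.
std::string effective_type(const std::map<std::string, ElementDecl>& decls,
                           const std::string& name) {
    std::set<std::string> visited; // guards against cyclic substitution groups
    std::string current = name;
    while (visited.insert(current).second) {
        auto it = decls.find(current);
        if (it == decls.end()) break;                    // unknown element
        if (it->second.type) return *it->second.type;    // explicit type found
        if (!it->second.substitution_group) break;       // nothing to inherit
        current = *it->second.substitution_group;        // walk up to the head
    }
    return "xs:anyType";
}

// The dc.xsd declarations quoted above, expressed in this simplified model.
std::map<std::string, ElementDecl> dc_decls() {
    return {
        {"any",         ElementDecl{std::string("SimpleLiteral"), std::nullopt}},
        {"title",       ElementDecl{std::nullopt, std::string("any")}},
        {"creator",     ElementDecl{std::nullopt, std::string("any")}},
        {"description", ElementDecl{std::nullopt, std::string("any")}},
        {"identifier",  ElementDecl{std::nullopt, std::string("any")}},
    };
}
```

With these declarations, effective_type(dc_decls(), "identifier") yields "SimpleLiteral", while an element that has neither a type nor a substitution group falls back to "xs:anyType".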

My short-term workaround (I've spent several days on what should be trivial)
is to replace substitutionGroup="any" with type="SimpleLiteral" in dc.xsd.
This makes all the code work, and my tests now pass.
But it still doesn't feel right.

Is there anything else I have missed, or another manual page or example I should have
looked at? :-)

Thanks for all your help. This is only the first step of my evaluation, as about 220 XML objects
await my interpretation next.

Regards,
Rob

-----Original Message-----
From: Boris Kolpackov [mailto:boris at codesynthesis.com] 
Sent: Tuesday, June 25, 2013 5:38 AM
To: Rob Ursem
Cc: xsd-users at codesynthesis.com
Subject: Re: [xsd-users] Issue using proxy for Open Packaging Conventions core-properties schema

Hi Rob,

Rob Ursem <Rob.Ursem at cmgl.ca> writes:

> As for the W3CDTF: W3CDTF is a SimpleLiteral type, which is a _type.

I looked up the definition of SimpleLiteral and it is a restriction of anyType, which in XSD is mapped to xml_schema::type.


> Neither W3CDTF nor SimpleLiteral provide access to the text 
> representation of the element as W3CDTF contains no variables or 
> attributes and SimpleLiteral only contains the (optional) lang 
> attribute.

Yes, anyType can contain any content (any mixture of text and elements). In fact, the definition of SimpleLiteral is quite bizarre in that it takes anyType and restricts it to allow only text content. Why not just use string as the base type then?


> There are accessors on _type for _node() which would get me to
> _node()->getTextContent() but the issue is that _node() returns a 
> nullptr for my W3CDTF instance.

This will work if you enable DOM association during parsing. See Section 5.1, "DOM Association" in the C++/Tree Mapping User Manual:

http://www.codesynthesis.com/projects/xsd/documentation/cxx/tree/manual/#5.1


> Perhaps there is an option that allows me to get the raw text for the 
> element so I could provide my own interpretation.

Yes, DOM association is that option.


> It seems counter to the philosophy of the library but is that what you 
> meant with "customize the generated type"? I can see I can modify the 
> generated code but I would rather not since there is the option to 
> re-generate it. I haven't tried deriving off the generated code since 
> generally the rest of the generated code does not instantiate or use 
> the derived code.

No, XSD provides proper support for type customization which doesn't require you to modify the generated code.

For more information on type customization see the C++/Tree Mapping Customization Guide:

http://wiki.codesynthesis.com/Tree/Customization_guide

As well as the examples in the examples/cxx/tree/custom/ directory.
In particular, the 'mixed' example shows how to do pretty much what you want.

Using this approach you could customize the SimpleLiteral type to contain/return the string you are interested in. Or you could customize W3CDTF to contain/return a date-time value. Or both.

The advantage of the type customization approach compared to DOM association is that the client code can use a clean API (no need to know anything about DOM). Plus, it works for both parsing and serialization (DOM association only works for parsing).
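
As an illustration of the "contain/return a date-time value" idea, a customized W3CDTF type would need to interpret the element text somewhere. The following is a minimal, XSD-independent sketch of that interpretation step, assuming only the full UTC form YYYY-MM-DDThh:mm:ssZ. The DateTime struct and the parse_w3cdtf function are hypothetical names, not part of XSD or of the generated code.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical value type that a customized W3CDTF class could expose.
struct DateTime {
    int year = 0, month = 1, day = 1;
    int hour = 0, minute = 0, second = 0;
};

// Parse the "YYYY-MM-DDThh:mm:ssZ" form of W3CDTF (UTC only). Other W3CDTF
// forms (reduced precision, numeric offsets, fractional seconds) are rejected
// in this sketch. Days are only range-checked against 1..31, not per month.
std::optional<DateTime> parse_w3cdtf(const std::string& text) {
    // 'd' stands for a decimal digit; every other character must match as-is.
    static const std::string pattern = "dddd-dd-ddTdd:dd:ddZ";
    if (text.size() != pattern.size()) return std::nullopt;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool ok = pattern[i] == 'd'
            ? std::isdigit(static_cast<unsigned char>(text[i])) != 0
            : text[i] == pattern[i];
        if (!ok) return std::nullopt;
    }

    // All digit positions are validated, so std::stoi cannot throw here.
    auto num = [&text](std::size_t pos, std::size_t len) {
        return std::stoi(text.substr(pos, len));
    };
    DateTime dt{num(0, 4), num(5, 2), num(8, 2),
                num(11, 2), num(14, 2), num(17, 2)};

    if (dt.month < 1 || dt.month > 12 || dt.day < 1 || dt.day > 31 ||
        dt.hour > 23 || dt.minute > 59 || dt.second > 59)
        return std::nullopt;
    return dt;
}
```

A customized type could call such a function on its text content once during parsing and hand the result to client code, which would then never need to touch the DOM.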

Boris


