[xsd-users] import, include, namespaces, restriction and schema versioning

Mon Aug 31 12:45:07 EDT 2009

Boris Kolpackov wrote:
> Eric Niebler <eric at boostpro.com> writes:
>> Ideally, we would mark up the xsd and have CodeSynth fill in the  
>> defaults for us, but that's something else entirely.
> 
> Yes, it seems there are two general ways in which the user may want
> this to be handled:
> 
> (1) Fill in the defaults for missing elements.

Yes.

> (2) Notify the user somehow about missing elements, presumably, 
>     during parsing.

I wasn't thinking of that. By "user" you mean me, the user of CodeSynth, 
right? And by "notify" you mean invoke a registered callback. Huh, could 
be useful, but is beyond my basic needs.

>>> On the surface validation during serialization often seem like a good
>>> idea. However, once you start thinking about what to do in case of
>>> an error, its usefulness becomes questionable, except, maybe, for
>>> debugging
>> I disagree, and this is where I see a situation where CodeSynth can  
>> provide a value-add over plain mapping and validation. Imagine that we  
>> have optional elements with special CodeSynth markup for what the value  
>> the element should take when it's missing. If CodeSynth did a schema  
>> validation pass on serialization, this is where the value for missing  
>> elements could be filled in automatically.
> 
> Wouldn't this be more naturally done during parsing so that the
> application developer doesn't have to worry about missing elements?

More naturally? Why? My intention is that the XML produced by our tools 
is always in the most up-to-date form, so it never has to be read back 
in just to be patched up.

But the fix-up-on-write thought is beyond my basic needs. For my 
scenario, what I'd like is:

1) On write, assert when a writeRequired element is missing.

2) On read, when a writeRequired element is missing, fill in a default 
value.
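
To make those two rules concrete, here is a minimal sketch of the behavior 
I have in mind. It is not anything CodeSynth generates today: SomeOtherType, 
foo_default, fill_defaults_on_read, write_required_present and 
check_on_write are all hypothetical names, and std::optional stands in for 
whatever container the mapping uses for minOccurs="0" elements.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a generated type with an optional element
// "foo" (minOccurs="0") that carries the proposed writeRequired markup.
struct SomeOtherType
{
  std::optional<std::string> foo;
};

// Hypothetical default, standing in for a value supplied by the schema author.
inline std::string foo_default ()
{
  return "default-foo";
}

// Rule 2: after parsing, fill in the default if the element was absent.
// An element that was present in the document is left untouched.
inline void fill_defaults_on_read (SomeOtherType& x)
{
  if (!x.foo)
    x.foo = foo_default ();
}

// True if every writeRequired element is present.
inline bool write_required_present (const SomeOtherType& x)
{
  return x.foo.has_value ();
}

// Rule 1: before serializing, assert that nothing writeRequired is missing.
inline void check_on_write (const SomeOtherType& x)
{
  assert (write_required_present (x) &&
          "writeRequired element 'foo' is missing");
}
```

The idea would be to call fill_defaults_on_read right after parsing and 
check_on_write right before serializing, so old documents get upgraded in 
memory and new documents can never be written without the element.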

>> That is, CodeSynth can proactively correct simple schema violations 
>> with a little guidance from the xsd author.
> 
> Such things can be implemented quite easily either during parsing
> or during serialization. And I agree, they can be useful. It is the
> full XML Schema validation that, IMO, is not very useful since it
> is not clear what to do when there is an error.

OK, understood.

>> If the issue is merely one of error detection and reporting, then yes I  
>> agree that validation on serialization doesn't make much sense except  
>> for debugging. But if you're willing to consider error correction, then  
>> it does make sense.
> 
> I think this will only work for automatic error correction, as above.
> In case the user intervention is required, it is not clear how to 
> supply the error information (e.g., position, description, etc.) that
> can be usable for anything other than an error message. 

Right. It's ok, I probably don't need this.

>>> Full-blown XSLT will also work and is probably simpler and quicker
>>> to implement (especially if you have multiple schema files connected
>>> via include/import). But the DOM approach is tidier since you don't
>>> need to carry two sets of schemas with your application.
>> Confused. If I use XSLT to process the schema, then I can still ship  
>> only 1 set of schema, right?
> 
> If you have a full-blown XSLT processor inside your application, then 
> yes, you can ;-).

We may need it anyway.

>> So maybe xse:writeRequired isn't quite what I want. I'd like a way to  
>> provide default values for optional elements in a way that they are  
>> filled in automatically on read; and on write, either (a) fill them in  
>> automatically, or (b) flag their absence as an error.
> 
> If they are filled in on read, why do you also need this on write?

Because I wouldn't want my tool to produce XML that is missing 
writeRequired elements. I may decide to publish the current schema 
(minus CodeSynth extensions like xse:writeRequired or xse:refType) to 
allow third-party utilities to consume the XML my tool produces.

But asserting on the absence of a writeRequired element achieves this 
end just as well.

> Perhaps the application can create the object model and not provide
> new elements (because you generate default c-tors)? Is that the use
> case you had in mind?

No, see above.

>>> But I feel that it is only a part of the solution. The user of the
>>> mapping still has to detect the missing elements and provide some
>>> default values manually. 
>> We're already doing that in our tool, so that's ok.
> 
> That sounds like quite a tedious task. I wonder what API can we use
> to notify the application about missing elements and request the
> default values..? We could use DOM to return the default values
> but returning an object model instance would be more convenient.
> We could also pass the reference to the containing object model
> node in case the handler needs it. 

I was hoping it could be as simple as something like this:

<xsd:complexType name="MyType">
   ...
</xsd:complexType>

<xsd:element name="MyTypeDefault" type="MyType">
   ...
</xsd:element>

<xsd:complexType name="SomeOtherType">
   <xsd:sequence>
     <xsd:element name="foo"
                  type="MyType"
                  minOccurs="0"
                  xse:writeRequired="true"
                  xse:defaultValue="MyTypeDefault"/>
   </xsd:sequence>
</xsd:complexType>

> One wild idea is to specify the handler function in the schema
> file, for example:
> 
>   <complexType name="person">
>     <sequence>
>       <element name="name" type="string"/>
>       <element name="age" type="int" minOccurs="0" 
>                xse:defaultHandler="person_age_value"/>
>     </sequence>
>   </complexType>
> 
> The signature of the person_age_value() function would be:
> 
> int person_age_value (person& p);
> 
> Then we can do something like:
> 
> std::map<string, int> ages = ...
> 
> int person_age_value (person& p)
> {
>   return ages[p.name ()];
> }
> 
> We would then have the companion attribute, xse:defaultValue, which
> is used to specify the default value in some form. Maybe an XPath to
> an element in the "default values file", for example:
> 
>   <complexType name="person">
>     <sequence>
>       <element name="name" type="string"/>
>       <element name="age" type="int" minOccurs="0" 
>                xse:defaultValue="/values/person/age"/>
>     </sequence>
>   </complexType>
> 
> The remaining question is how the values from this file are 
> accessed to construct object model nodes. The cleanest approach
> would be to somehow embed these values directly into the generated
> code. Ideally, we would have a static object model instance that
> represents the default value. But it is not clear how to initialize
> it (Xerces-C++ is not usable during static initialization).

Whoa, you're going WAY beyond anything I had in mind. :-) Pretty cool, 
but feels over-engineered for my purposes. Could it really be accomplished?
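
On the static-initialization worry, one possible way around it (just a 
sketch, assuming C++11 or later) is to build the default lazily on first 
use instead of at static-initialization time. A function-local static is 
initialized the first time control reaches it, so any parser setup can be 
done before that point. MyType, make_my_type_default and my_type_default 
below are hypothetical names; the factory simply returns a hard-coded value 
where real code would construct the default object model instance.

```cpp
#include <string>

// Hypothetical stand-in for a generated object model type.
struct MyType
{
  std::string value;
};

// Counts how many times the default has been constructed (for illustration).
inline int& construction_count ()
{
  static int n = 0;
  return n;
}

// Hypothetical factory. Real code would build the default instance here,
// at a point where the XML runtime is already initialized.
inline MyType make_my_type_default ()
{
  ++construction_count ();
  return MyType{"default"};
}

// Lazily constructed default: the static is initialized on the first call,
// not during static initialization of the program, and only once.
inline const MyType& my_type_default ()
{
  static const MyType instance = make_my_type_default ();
  return instance;
}
```

Every caller gets a reference to the same instance, and nothing is 
constructed until someone actually asks for the default.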

-- 
Eric Niebler
BoostPro Computing
http://www.boostpro.com