[xsd-users] import, include, namespaces, restriction and schema versioning

Thu Aug 27 12:15:30 EDT 2009

Hi Eric,

Eric Niebler <eric at boostpro.com> writes:

> > Is it going to be done as a reaction to a validation error during
> > serialization or proactively before the validation?
>
> Well, the idea is that the element will be optional, so its absence on
> read will not be a schema violation. But I've already told you that, so  
> I feel like I must not be understanding your question. Can you clarify?
>
> In our tool, after we (de-)serialize from XML to DOM, we walk the DOM  
> and find the missing elements and fill in defaults. Does that clear it 
> up?

Yes, I was thinking about these two options: you can either do it
proactively, like you do now, or, theoretically, it can be done as a
reaction to validation errors during serialization.

> Ideally, we would mark up the xsd and have CodeSynth fill in the  
> defaults for us, but that's something else entirely.

Yes, it seems there are two general ways in which the user may want
this to be handled:

(1) Fill in the defaults for missing elements.

(2) Notify the user somehow about missing elements, presumably, 
    during parsing.
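
From the application's point of view the two options could look
roughly like the sketch below. All the names here (person,
fill_defaults(), report_missing()) are made up for illustration, and
std::optional merely stands in for the optional element container of
the generated code:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated object model class.
struct person
{
  std::string name;
  std::optional<int> age; // Optional element (minOccurs="0").
};

// Option (1): fill in a default for the missing element.
inline void fill_defaults (person& p, int default_age)
{
  if (!p.age)
    p.age = default_age;

  assert (p.age); // Postcondition: the element is now present.
}

// Option (2): collect the names of the missing elements so that
// the caller can decide what to do about them.
inline std::vector<std::string> report_missing (const person& p)
{
  std::vector<std::string> r;

  if (!p.age)
    r.push_back ("age");

  return r;
}
```

With (1) the application never sees the gap; with (2) it has to
react to every reported name itself.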

> > On the surface, validation during serialization often seems like a good
> > idea. However, once you start thinking about what to do in case of
> > an error, its usefulness becomes questionable, except, maybe, for
> > debugging.
>
> I disagree, and this is where I see a situation where CodeSynth can  
> provide a value-add over plain mapping and validation. Imagine that we  
> have optional elements with special CodeSynth markup for what the value  
> the element should take when it's missing. If CodeSynth did a schema  
> validation pass on serialization, this is where the value for missing  
> elements could be filled in automatically.

Wouldn't this be more naturally done during parsing so that the
application developer doesn't have to worry about missing elements?

> That is, CodeSynth can proactively correct simple schema violations 
> with a little guidance from the xsd author.

Such things can be implemented quite easily either during parsing
or during serialization. And I agree, they can be useful. It is the
full XML Schema validation that, IMO, is not very useful since it
is not clear what to do when there is an error.

> If the issue is merely one of error detection and reporting, then yes I  
> agree that validation on serialization doesn't make much sense except  
> for debugging. But if you're willing to consider error correction, then  
> it does make sense.

I think this will only work for automatic error correction, as above.
If user intervention is required, it is not clear how to supply the
error information (e.g., position, description) in a form that is
usable for anything other than an error message.

> > Full-blown XSLT will also work and is probably simpler and quicker
> > to implement (especially if you have multiple schema files connected
> > via include/import). But the DOM approach is tidier since you don't
> > need to carry two sets of schemas with your application.
>
> Confused. If I use XSLT to process the schema, then I can still ship  
> only 1 set of schema, right?

If you have a full-blown XSLT processor inside your application, then 
yes, you can ;-).

> So maybe xse:writeRequired isn't quite what I want. I'd like a way to  
> provide default values for optional elements in a way that they are  
> filled in automatically on read; and on write, either (a) fill them in  
> automatically, or (b) flag their absence as an error.

If they are filled in on read, why do you also need this on write?
Perhaps the application can create the object model and not provide
new elements (because you generate default c-tors)? Is that the use
case you had in mind?

> > But I feel that it is only a part of the solution. The user of the
> > mapping still has to detect the missing elements and provide some
> > default values manually. 
>
> We're already doing that in our tool, so that's ok.

That sounds like quite a tedious task. I wonder what API we could use
to notify the application about missing elements and request the
default values. We could use DOM to return the default values,
but returning an object model instance would be more convenient.
We could also pass a reference to the containing object model
node in case the handler needs it.

One wild idea is to specify the handler function in the schema
file, for example:

  <complexType name="person">
    <sequence>
      <element name="name" type="string"/>
      <element name="age" type="int" minOccurs="0" 
               xse:defaultHandler="person_age_value"/>
    </sequence>
  </complexType>

The signature of the person_age_value() function would be:

int person_age_value (person& p);

Then we can do something like:

std::map<std::string, int> ages = ...

int person_age_value (person& p)
{
  return ages[p.name ()];
}
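
To make the idea a bit more concrete, here is a self-contained
sketch of how the generated parsing code might use such a handler.
Everything in it is an assumption: xse:defaultHandler does not
exist, person is a hand-written stand-in for the generated class,
parse_person() only imitates the point where the parser notices
that 'age' is absent, and the values in the map are made up:

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hand-written stand-in for the generated class (hypothetical).
class person
{
public:
  explicit person (const std::string& n): name_ (n) {}

  const std::string& name () const {return name_;}
  const std::optional<int>& age () const {return age_;}
  void age (int a) {age_ = a;}

private:
  std::string name_;
  std::optional<int> age_;
};

// Application-provided table of default ages (made-up values).
std::map<std::string, int> ages = {{"John", 30}, {"Jane", 28}};

// The handler named in the schema. It uses find() rather than
// operator[] so that a lookup never inserts a new entry.
int person_age_value (person& p)
{
  auto i = ages.find (p.name ());
  return i != ages.end () ? i->second : 0;
}

// What the generated parsing code could do (hypothetical): if the
// 'age' element is absent, ask the handler for its value.
person parse_person (const std::string& name, std::optional<int> age)
{
  person p (name);

  if (age)
    p.age (*age);
  else
    p.age (person_age_value (p));

  assert (p.age ()); // Always present after parsing.
  return p;
}
```

Note that the handler only gets called for missing elements, so
values that are present in the document are never overridden.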

We could then have a companion attribute, xse:defaultValue, which
specifies the default value in some form. Maybe as an XPath to
an element in the "default values file", for example:

  <complexType name="person">
    <sequence>
      <element name="name" type="string"/>
      <element name="age" type="int" minOccurs="0" 
               xse:defaultValue="/values/person/age"/>
    </sequence>
  </complexType>

The remaining question is how the values from this file are 
accessed to construct object model nodes. The cleanest approach
would be to somehow embed these values directly into the generated
code. Ideally, we would have a static object model instance that
represents the default value. But it is not clear how to initialize
it (Xerces-C++ is not usable during static initialization).
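
One general C++ technique that sidesteps static initialization is
to construct the default instance lazily, on first use, with a
function-local static. The sketch below only illustrates that
technique; it is not something XSD generates. The names (person,
make_default_person(), default_person()) and values are made up,
and real code would build the instance from the default values
file once the application has initialized Xerces-C++:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a generated object model class.
struct person
{
  std::string name;
  int age;
};

// Instrumentation for this sketch only: counts constructions.
int default_person_constructions = 0;

// Builds the default value. Real code would parse it from the
// default values file here.
person make_default_person ()
{
  ++default_person_constructions;

  person p;
  p.name = "unknown";
  p.age = 0;
  return p;
}

// Returns the shared default instance. The function-local static
// is constructed on the first call (thread-safe since C++11), not
// during static initialization.
const person& default_person ()
{
  static const person p = make_default_person ();
  assert (default_person_constructions == 1); // Built exactly once.
  return p;
}
```

The price is that the first call pays for the construction and
that the application must not call default_person() before the
XML runtime is ready.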

Boris