[xsd-users] <choice> abuse

Boris Kolpackov boris at codesynthesis.com
Tue Mar 13 15:50:11 EDT 2007


Hi Ray,

Ray Lischner <rlischner at proteus-technologies.com> writes:

> <complexType name="X">
>   <choice>
>     <element name="a" type="int"/>
>     <element name="b" type="float"/>
>     <element name="c" type="string"/>
>   </choice>
> </complexType>
>
> The generated class X has three optional members: a, b, and c. The
> problem is that I can set all three members:
>
> X x;
> x.a(20);
> x.b(3.14);
> x.c("hello");
>
> with the nonsensical result that the "choice" has all three values,
> not only one. I wish the code were more resilient to programmer
> error.

This is part of a bigger issue that all data binding tools have
to deal with. It boils down to whether to recreate the schema
structure with an "unnatural" access API or to "flatten" the
structure and generate an easy to use but sometimes dangerous
API. We decided to go with the flat, easy, and dangerous ;-).
I am not sure this is the right way. I am actually not even sure
there is a right way. It appears to depend heavily on the schema
design, i.e., document-centric schemas normally have a lot of
structure and would probably be better off with a structured API
while data-centric schemas are normally simpler and do not rely
on relative ordering of elements, etc. In the latter case the flat
API is probably a better choice.


> Or the implementation of <choice> could switch to a union, with
> a generated enumeration to specify which union member is valid.

Yes, this is one possibility and it will work pretty well for the
example you gave. But consider something like this (quite a common
pattern, BTW):

<complexType name="email">
  <sequence>
    <choice maxOccurs="unbounded">
      <element name="to" type="string"/>
      <element name="cc" type="string"/>
      <element name="bcc" type="string"/>
    </choice>
    <element name="subject" type="string"/>
    <element name="body" type="string"/>
  </sequence>
</complexType>

Here, <choice> is used to say "at least one of to|cc|bcc in any
order". Let's see how this can be mapped to a union. Because
we have a sequence of choices (maxOccurs="unbounded"), we will
need to generate a nested type for this compositor, something
along these lines:

struct email
{
  struct choice_element_type
  {
    enum tag
    {
      to_tag,
      cc_tag,
      bcc_tag
    };

    tag which ();

    string to ();
    string cc ();
    string bcc ();
  };

  typedef sequence<choice_element_type> choice_type;

  choice_type& choice ();

  string subject ();
  string body ();

  ...

};

Note how we had to invent a lot of names that are not really found
in the schema -- they are all candidates for name conflicts. While
this API preserves the structure of the content, the flat API is
probably preferable in this case since all one needs is a list
of to's, cc's and bcc's.
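
For comparison, here is roughly what a flat view of the same type
could look like. This is a hand-written sketch, not actual compiler
output: the plain std::vector members and the recipient_count helper
are assumptions made for illustration. Note that the relative order
of to/cc/bcc in the document is not kept, and that the "at least one"
constraint is not enforced by the interface:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat mapping of the "email" type; hand-written for
// illustration, not what the compiler generates.
struct email
{
  // One list per element name; interleaving order is lost.
  std::vector<std::string> to;
  std::vector<std::string> cc;
  std::vector<std::string> bcc;

  std::string subject;
  std::string body;
};

// Total number of recipients. The schema wants at least one, but
// nothing in the flat interface enforces that.
std::size_t recipient_count (const email& e)
{
  return e.to.size () + e.cc.size () + e.bcc.size ();
}
```

No names had to be invented here beyond what is in the schema, which
is the main attraction of the flat approach.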


> Meanwhile, I'm open to suggestions for the best way to ensure
> that programmers do not inadvertently create invalid objects.

At the moment there is no way to enforce this with the interface
so the only way I can see is to educate them about the limitations.
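
Short of that, application code could at least detect the problem at
runtime, for example with a check that is called before an object is
serialized. A minimal sketch, assuming a hand-written stand-in for the
generated class (optional_member, this X, and valid_choice are all
made up for illustration; the real generated interface is different):

```cpp
#include <string>

// Minimal stand-in for an optional member (hypothetical; not the
// container the compiler actually generates).
template <typename T>
class optional_member
{
public:
  optional_member () : present_ (false), value_ () {}

  void set (const T& v) { value_ = v; present_ = true; }
  bool present () const { return present_; }
  const T& get () const { return value_; }

private:
  bool present_;
  T value_;
};

// Stand-in for the flat mapping of X: three independent optionals.
struct X
{
  optional_member<int> a;
  optional_member<float> b;
  optional_member<std::string> c;
};

// True only if exactly one branch of the choice is set.
bool valid_choice (const X& x)
{
  int n = 0;
  if (x.a.present ()) ++n;
  if (x.b.present ()) ++n;
  if (x.c.present ()) ++n;
  return n == 1;
}
```

This does not prevent the invalid state, it only reports it, so it is
a debugging aid rather than a fix.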

If there were a possibility to choose between structured and flat
APIs, would you switch to the structured even if it complicates
your code quite a bit?


thanks,
-boris