[xsd-users] iso-8859-1 - Serialization Error

Boris Kolpackov boris at codesynthesis.com
Wed Apr 22 07:50:03 EDT 2009


Hi Constantin,

CC'ed xsd-users; see rule #1 in posting guidelines:

http://www.codesynthesis.com/support/posting-guidelines.xhtml

Constantin Iacobescu <sir.costy at gmail.com> writes:

> You are right, but
> I thought that the function
> serializeLandXML (ofs,
>              *objLandXML,
>              infoMap,
>              "iso-8859-1", //"UTF-8",
>              xml_schema::Flags::dont_initialize
>              );
> 
> also changes the DOM encoding.
> 
> My app does not support Unicode strings yet, so my other question is:
> how can I tell the parser which encoding it should use?
> I want to change the encoding used for serialization from the default to
> ISO-8859-1, but unfortunately I don't know how to do that in a simple way.

You are confusing several things here. There can be three unrelated
encodings involved in parsing, accessing/modifying, and serializing
XML with XSD. These are:

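1. Character encoding in the document being parsed. This is specified
   in the XML document itself, normally in the encoding declaration
   (for example, <?xml version="1.0" encoding="ISO-8859-1"?>).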

2. Character encoding in the object model. As we have already discussed,
   when char is used as the character type, it is UTF-8 by default and
   can be changed to "local code page" with the XSD_USE_LCP macro.

3. Character encoding in the document being serialized. This is specified
   by the encoding argument passed to the serialization function
   ("iso-8859-1" in your call above).

It is perfectly normal for all three to be different. For example, the
document you want to parse is in ISO-8859-1, the object model encoding
is UTF-8, and you serialize the document in UTF-16.

When the document is parsed, the generated code automatically converts
the data from the document encoding (1) to the object model encoding (2).
Similarly, when the document is serialized, the data is converted from
the object model encoding (2) to the output document encoding (3).
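
To make the two conversion steps concrete, here is a minimal standalone
sketch in plain C++. It is not what the XSD-generated code or Xerces-C++
does internally; the function names latin1_to_utf8 and utf8_to_utf16 are
hypothetical and exist only for this illustration. It assumes well-formed
input and simply mirrors the example above: ISO-8859-1 in the document (1),
UTF-8 in the object model (2), and UTF-16 in the output (3).

```cpp
#include <cstddef>
#include <string>

// Illustration only: hypothetical helpers, not part of XSD or Xerces-C++.

// (1) -> (2): ISO-8859-1 document bytes to UTF-8 object model text.
// Every ISO-8859-1 byte maps to the Unicode code point with the same value.
std::string latin1_to_utf8 (const std::string& in)
{
  std::string out;
  for (unsigned char c : in)
  {
    if (c < 0x80)
      out += static_cast<char> (c);                 // ASCII: one byte.
    else
    {
      out += static_cast<char> (0xC0 | (c >> 6));   // Lead byte.
      out += static_cast<char> (0x80 | (c & 0x3F)); // Continuation byte.
    }
  }
  return out;
}

// (2) -> (3): UTF-8 object model text to UTF-16 output code units.
// Assumes the input is valid UTF-8; no error handling is attempted.
std::u16string utf8_to_utf16 (const std::string& in)
{
  std::u16string out;
  for (std::size_t i = 0; i < in.size ();)
  {
    unsigned char b = static_cast<unsigned char> (in[i]);

    // Sequence length is determined by the lead byte.
    std::size_t len = b < 0x80 ? 1 : b < 0xE0 ? 2 : b < 0xF0 ? 3 : 4;

    // Payload bits of the lead byte, then 6 bits per continuation byte.
    char32_t cp = len == 1 ? b : b & (0xFF >> (len + 1));
    for (std::size_t k = 1; k < len && i + k < in.size (); ++k)
      cp = (cp << 6) | (static_cast<unsigned char> (in[i + k]) & 0x3F);
    i += len;

    if (cp < 0x10000)
      out += static_cast<char16_t> (cp);
    else
    {
      // Code points above U+FFFF become a surrogate pair.
      cp -= 0x10000;
      out += static_cast<char16_t> (0xD800 | (cp >> 10));
      out += static_cast<char16_t> (0xDC00 | (cp & 0x3FF));
    }
  }
  return out;
}
```

For instance, the ISO-8859-1 bytes "caf\xE9" become "caf\xC3\xA9" in UTF-8
and then the four UTF-16 code units 0x63 0x61 0x66 0xE9. The same text has
a different byte representation at each of the three stages, which is why
the three encodings can be chosen independently.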

Boris



