[xsd-users] Serialization and extended ascii characters.

Boris Kolpackov boris at codesynthesis.com
Fri Jun 6 09:08:38 EDT 2008


Hi Jan,

jan.noorland at iff.com writes:

> My code is built with Visual C++ 2003 and the input data is defined as
> char. The character is a valid ISO Latin-1 character (á = 0xE1).

Yes, but the single byte 0xE1 is not a valid UTF-8 sequence. In UTF-8
the same character (U+00E1) is represented as a 2-byte sequence:
0xC3 0xA1.
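
To make the byte arithmetic concrete, here is a minimal sketch of
converting an ISO Latin-1 string to UTF-8 by hand. The function name
latin1_to_utf8 is made up for this illustration and is not part of XSD
or Xerces-C++. It relies only on the fact that Latin-1 byte values
coincide with Unicode code points U+0000 to U+00FF:

```cpp
#include <string>

// Hypothetical helper (not an XSD API): convert an ISO Latin-1
// (ISO-8859-1) byte string to UTF-8. Bytes below 0x80 are plain ASCII
// and are copied as-is; bytes 0x80..0xFF become two-byte sequences.
std::string latin1_to_utf8 (const std::string& in)
{
  std::string out;
  out.reserve (in.size () * 2);

  for (std::string::size_type i = 0; i < in.size (); ++i)
  {
    // Go through unsigned char so that 0xE1 is 225, not a negative value.
    unsigned char c = static_cast<unsigned char> (in[i]);

    if (c < 0x80)
      out += static_cast<char> (c);
    else
    {
      out += static_cast<char> (0xC0 | (c >> 6));   // lead byte: 110xxxxx
      out += static_cast<char> (0x80 | (c & 0x3F)); // continuation: 10xxxxxx
    }
  }

  return out;
}
```

For 0xE1 the lead byte is 0xC0 | 0x03 = 0xC3 and the continuation byte
is 0x80 | 0x21 = 0xA1, which matches the sequence above.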


> Does it imply that I need to use the --char-type wchar_t parameter to 
> regenerate the XSD code?

You have three options here:

1. Use 'char' as the character type and represent non-ASCII characters as
   proper UTF-8 sequences.

2. Use 'char' as the character type and compile your code with the
   XSD_USE_LCP macro defined. If your Windows installation and those of
   all your users are configured to use ISO Latin-1 as the encoding,
   then everything should work.

3. Use 'wchar_t' as the character type (compile your schemas with
   --char-type wchar_t) and use the UTF-16 representation of the
   character, which is 0x00E1 (L"á" will probably also work).
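
If you go with option 3 but your input data still arrives as Latin-1
'char' strings, something along these lines could do the widening. This
is only a sketch: latin1_to_wide is a hypothetical helper, not an XSD
function, and it assumes the input really is ISO Latin-1, so each byte
value is already the Unicode code point:

```cpp
#include <string>

// Hypothetical helper (not an XSD API): widen an ISO Latin-1 byte
// string to a wide string. Each Latin-1 byte maps to the code point
// with the same value, which fits in a single UTF-16 code unit.
std::wstring latin1_to_wide (const std::string& in)
{
  std::wstring out;
  out.reserve (in.size ());

  for (std::string::size_type i = 0; i < in.size (); ++i)
  {
    // Cast via unsigned char first; where plain char is signed, a direct
    // cast of 0xE1 would sign-extend to 0xFFE1 instead of 0x00E1.
    out += static_cast<wchar_t> (static_cast<unsigned char> (in[i]));
  }

  return out;
}
```

The resulting std::wstring can then be used wherever the code generated
with --char-type wchar_t expects a wide string.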

Boris




