[xsd-users] Poor performance of Unicode conversion

Ray Lischner rlischner at proteus-technologies.com
Tue Oct 2 09:32:41 EDT 2007


> The test parses the above 300Kb document in about 15ms. As you can see,
> the UTF-16 to UTF-8 conversion only takes 13% of the time. So even if we
> eliminate it completely, the overall speedup can't be by more than 13%.

One person's "only" is another person's "as much as". We see similar results, but from our point of view, if Xerces were miraculously changed to use UTF-8, that 13% would vanish instantly. It is pure waste. I realize that the waste is the result of a design decision by Xerces and entirely out of your control.
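
To put rough numbers on that, here is a back-of-the-envelope sketch that uses only the two figures quoted above (about 15 ms total, 13% of it in conversion). The function name is made up for illustration, and it assumes the conversion cost could be removed entirely with nothing else changing.

```python
def remaining_ms(total_ms, fraction_removed):
    """Run time left after eliminating a fraction of the total."""
    return total_ms * (1.0 - fraction_removed)

total_ms = 15.0     # parse time quoted above
conversion = 0.13   # share spent converting UTF-16 to UTF-8

after = remaining_ms(total_ms, conversion)
saved = total_ms - after
gain = total_ms / after - 1.0   # relative increase in documents parsed per second

print(f"saved {saved:.2f} ms, {after:.2f} ms remaining, throughput up {gain:.1%}")
```

The saving per document is small in absolute terms, but it applies to every document parsed, which is why it matters to us.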
 
> The optimizations are for XSD 3.0.0 and will appear in the next release.
> I can also backport them to 2.3.1 if anybody is interested but cannot
> upgrade to 3.0.0.

Thanks, but because we've already modified the code base, I think merging your optimizations might be harder than switching to Xerces 2.8.0. From what I've read, I think Xerces 2.8.0 will give us the most benefit for the least effort right now.
 
Thank you for your help. I see from the Xerces-C web site that we have you to thank for many of the Xerces 2.8.0 improvements, too.
--
Ray Lischner, Proteus Technologies LLC


