[xsd-users] Poor performance of Unicode conversion
Ray Lischner
rlischner at proteus-technologies.com
Tue Oct 2 09:32:41 EDT 2007
> The test parses the above 300Kb document in about 15ms. As you can see,
> the UTF-16 to UTF-8 conversion only takes 13% of the time. So even if we
> eliminate it completely, the overall speedup can't be by more than 13%.
One person's "only" is another person's "as much as". We see similar results, but from our point of view, if Xerces were miraculously changed to use UTF-8, that 13% would vanish instantly. It is pure waste. I realize that the waste is the result of a design decision by Xerces and entirely out of your control.
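To put rough numbers on that, here is a minimal sketch of the arithmetic, using only the two figures quoted above (about 15ms total, 13% spent in conversion). The function names are hypothetical and just for illustration; nothing here comes from XSD or Xerces.

```cpp
#include <cstdio>

// Time that would remain if a given fraction of the run were eliminated.
// fraction_removed is a value in [0, 1), e.g. 0.13 for 13%.
double remaining_ms(double total_ms, double fraction_removed) {
    return total_ms * (1.0 - fraction_removed);
}

// Overall speedup factor (old time / new time) from eliminating
// that fraction of the run entirely.
double speedup_factor(double fraction_removed) {
    return 1.0 / (1.0 - fraction_removed);
}

// Print both figures for one measurement.
void report(double total_ms, double fraction_removed) {
    std::printf("remaining: %.2f ms, speedup: %.3fx\n",
                remaining_ms(total_ms, fraction_removed),
                speedup_factor(fraction_removed));
}
```

Calling report(15.0, 0.13) gives roughly 13.05 ms remaining and a speedup factor of about 1.149. So removing the conversion cuts the run time by 13%, which works out to roughly 15% more documents parsed in the same time.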
> The optimizations are for XSD 3.0.0 and will appear in the next release.
> I can also backport them to 2.3.1 if anybody is interested but cannot
> upgrade to 3.0.0.
Thanks, but because we've already modified the code base, I think merging your optimizations might be harder than switching to Xerces 2.8.0. From what I've read, I think Xerces 2.8.0 will give us the best benefit for the effort right now.
Thank you for your help. I see from the Xerces-C web site that we have you to thank for many of the Xerces 2.8.0 improvements, too.
--
Ray Lischner, Proteus Technologies LLC