[xsd-users] dealing with xml written/read on-the-fly
Boris Kolpackov
boris at codesynthesis.com
Thu Nov 5 09:02:23 EST 2009
Hi Cerion,
Cerion Armour-Brown <cerion at kestrel.ws> writes:
> > What actually happens is this: if the raw character buffer has less
> > than 100 bytes when Xerces-C++ tries to transcode the next batch of
> > characters, then it will try to read some more. There is actually a
> > technical reason for this other than efficiency (it has to do with
> > multi-byte encodings and the buffer containing only some of the bytes
> > constituting a code point).
>
> Indeed.
>
> > Because Xerces-C++ won't keep trying to read more if the stream returned
> > less than 100 bytes, one way to mitigate this would be to return the
> > data from InputSource::readBytes() in small chunks. If you return it
> > one byte at a time, there will be no buffering at all.
>
> Eugh - that's horrible! :-)
A quick update: I have fixed this issue for the upcoming Xerces-C++ 3.1.0
(should be out in about a month). It is now possible to change this
100-byte threshold (the "low water mark") for each parser instance.
Setting it to 0 disables buffering altogether. For details, see:
https://issues.apache.org/jira/browse/XERCESC-1607
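
For anyone who has to stay on a release before 3.1.0, the small-chunk
workaround quoted above could look roughly like the sketch below. It is
a stand-alone illustration, not Xerces-C++ code: ChunkedSource and its
chunk parameter are hypothetical names, and the class does not derive
from any Xerces-C++ type. In Xerces-C++ itself readBytes() is declared
on BinInputStream (the stream object that an InputSource creates via
makeStream()), so in a real program the same capping logic would go into
the readBytes() override of your own BinInputStream-derived class.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-alone class used only to illustrate the idea.
// It hands out its data in pieces of at most `chunk` bytes, no matter
// how much the caller asks for.
class ChunkedSource
{
public:
  ChunkedSource (const std::string& data, std::size_t chunk)
    : data_ (data),
      pos_ (0),
      chunk_ (chunk == 0 ? 1 : chunk) // A chunk of 0 would look like EOF.
  {
  }

  // Copy at most chunk_ bytes into toFill, even if maxToRead is larger.
  // Returns the number of bytes copied; 0 means end of data.
  std::size_t
  readBytes (unsigned char* toFill, std::size_t maxToRead)
  {
    std::size_t left = data_.size () - pos_;
    std::size_t n = std::min (std::min (chunk_, maxToRead), left);

    if (n != 0)
      std::memcpy (toFill, data_.data () + pos_, n);

    pos_ += n;
    return n;
  }

private:
  std::string data_;
  std::size_t pos_;
  std::size_t chunk_;
};
```

With a chunk size of 1 every call returns a single byte, which is the
"no buffering at all" case from the quoted text. The price is one call
per byte, so this only makes sense for streams where latency matters
more than throughput.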
Boris