[xsd-users] dealing with xml written/read on-the-fly
Cerion Armour-Brown
cerion at kestrel.ws
Sun Oct 18 05:01:12 EDT 2009
Hi Boris,
Boris Kolpackov wrote:
> Cerion Armour-Brown <cerion at kestrel.ws> writes:
>> You say it will block if there's no data: when I try the examples out
>> using files, an error is thrown when EOF is read instead of the expected
>> xml. Hence my understanding that one would need to poll to make sure
>> the read will succeed... Am I missing something?
>>
>
> If the stream ends with EOF then the parser assumes there is no more
> data available. And if the document is incomplete, then you will get
> a parsing error. In your case, I guess, you will need to provide a
> custom std::istream implementation (or xercesc::InputSource -- that
> could actually be easier) that doesn't return on EOF but instead keeps
> polling the file for more data (e.g., you could save the offset of the
> last byte read, wait some time, re-open the file, seek to that saved
> offset, and see if there is more data). I assume you will need to
> implement this logic somewhere in the application in any case. With
> this approach it will just be in the stream.
>
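For concreteness, the polling stream suggested above could look roughly
like the sketch below. This is only an illustration under assumptions: the
class name tail_filebuf, the polling interval and the retry limit are
made up here, and nothing in it comes from XSD or Xerces-C++. It re-opens
the file on every poll, seeks to the saved offset, and reports EOF only
once the retry limit is exhausted.

```cpp
#include <chrono>
#include <fstream>
#include <ios>
#include <istream>
#include <streambuf>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper (not part of XSD or Xerces-C++): a read-only
// streambuf that treats EOF as "no data yet" and polls the file for
// bytes appended since the last read.
class tail_filebuf : public std::streambuf
{
public:
  // max_retries < 0 means poll forever; 0 means check once and give up.
  tail_filebuf (std::string path,
                std::chrono::milliseconds interval,
                int max_retries)
    : path_ (std::move (path)),
      interval_ (interval),
      max_retries_ (max_retries)
  {
  }

protected:
  int_type underflow () override
  {
    // Data still buffered: hand it out without touching the file.
    if (gptr () < egptr ())
      return traits_type::to_int_type (*gptr ());

    int attempt = 0;
    for (;;)
    {
      // Re-open the file and seek to the offset just past the last
      // byte already delivered.
      std::ifstream in (path_, std::ios::binary);
      if (in)
      {
        in.seekg (offset_, std::ios::beg);
        in.read (buf_, sizeof (buf_));
        std::streamsize n = in.gcount ();
        if (n > 0)
        {
          offset_ += n;
          setg (buf_, buf_, buf_ + n);
          return traits_type::to_int_type (buf_[0]);
        }
      }

      // Nothing new: give up if the retry budget is spent, else wait.
      if (max_retries_ >= 0 && attempt++ >= max_retries_)
        return traits_type::eof ();

      std::this_thread::sleep_for (interval_);
    }
  }

private:
  std::string path_;
  std::chrono::milliseconds interval_;
  int max_retries_;
  std::streamoff offset_ = 0;
  char buf_[4096];
};
```

A std::istream constructed over such a buffer could then be handed to
whatever consumes the stream. Note that underflow() sleeps in the calling
thread while waiting, so the consumer blocks for as long as the file does
not grow.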
I had a look at doing this, but I'm not happy about this direction.
Xerces buffers the file data, and if the buffer gets low, it reads
ahead. This means there may be data available to Xerces (in its buffer),
but we're going to block on the file anyway. Plus I would need to take a
look at the data last read from the file (i.e. in the Xerces buffer, or
by seeking back in the file) to see if EOF has been reached correctly
(i.e. the closing tag has been read in).
I find this too sensitive to the underlying Xerces implementation (and
more work than I'd hoped for!)
If I can avoid it, I'd prefer not to work with separate threads at all
(the above blocking read solution would need that). I imagined my Qt app
could be the driver, with a loop to pull in the next (few) top level
tags, and then update the GUI, and so on. This simplifies the whole
setup, and keeps Qt in control.
Qt solves this EOF problem by returning an UnexpectedEOF error, but makes
this recoverable, so we can continue parsing. From what I understand
from the docs and source code, XSD / Xerces don't (yet) support recovery
from this?
If they do, how is it done, and is this a way forward?
I don't even see how I can identify the error well - it seems the error
type number isn't propagated, only the message string, and I'm not going
to match on that!
Thanks again for your help thus far, it's appreciated.
Cerion
P.S. Do you have plans to make an XML binder for the Qt parsers? ;-)