[xsd-users] dealing with xml written/read on-the-fly

Cerion Armour-Brown cerion at kestrel.ws
Sun Oct 18 05:01:12 EDT 2009


Hi Boris,

Boris Kolpackov wrote:
> Cerion Armour-Brown <cerion at kestrel.ws> writes:
>> You say it will block if there's no data: when I try the examples out  
>> using files, an error is thrown when EOF is read instead of the expected  
>> xml.  Hence my understanding that one would need to poll to make sure  
>> the read will succeed...  Am I missing something?
>>     
>
> If the stream ends with EOF then the parser assumes there is no more
> data available. And if the document is incomplete, then you will get
> a parsing error. In your case, I guess, you will need to provide a
> custom std::istream implementation (or xercesc::InputSource -- that 
> could actually be easier) that doesn't return on EOF but instead keeps
> polling the file for more data (e.g., you could save the offset of the
> last byte read, wait some time, re-open the file, seek to that saved 
> offset, and see if there is more data). I assume you will need to
> implement this logic somewhere in the application in any case. With
> this approach it will just be in the stream.
>   
I had a look at doing this, but I'm not happy about this direction.
Xerces buffers the file data, and if the buffer gets low, it reads
ahead. This means there may be data available to Xerces (in its buffer),
but we're going to block on the file anyway. Plus I would need to take a
look at the data last read from the file (i.e. in the Xerces buffer, or
seek back in the file), to see if EOF has been reached correctly (closing
tag has been read in).
I find this too sensitive to the underlying Xerces implementation (and
more work than I'd hoped for!)
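
For reference, the polling stream suggested above could be sketched roughly
as follows. This is only a minimal illustration under stated assumptions:
`polling_filebuf`, its constructor parameters and the `max_idle_polls` bound
are hypothetical names of my own, not part of XSD or Xerces-C++. It does
nothing about the read-ahead and closing-tag concerns described above; it
only shows the "save the offset, wait, re-open, seek" idea.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <ios>
#include <istream>
#include <streambuf>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: a stream buffer that treats end-of-file as
// "no data yet" and keeps polling the file for appended bytes.
class polling_filebuf : public std::streambuf
{
public:
  // max_idle_polls bounds the number of empty polls per read so that the
  // sketch can terminate; 0 means poll forever (i.e. block).
  polling_filebuf (std::string path,
                   std::chrono::milliseconds interval,
                   std::size_t max_idle_polls = 0)
    : path_ (std::move (path)),
      interval_ (interval),
      max_idle_polls_ (max_idle_polls),
      buf_ (4096)
  {
    assert (interval.count () >= 0);
  }

protected:
  int_type underflow () override
  {
    if (gptr () < egptr ())
      return traits_type::to_int_type (*gptr ());

    for (std::size_t idle = 0;;)
    {
      // Re-open the file and seek to the offset just past the last byte read.
      std::ifstream f (path_, std::ios::binary);
      if (f)
      {
        f.seekg (offset_, std::ios::beg);
        f.read (buf_.data (), static_cast<std::streamsize> (buf_.size ()));
        std::streamsize n = f.gcount ();
        if (n > 0)
        {
          offset_ += n;
          setg (buf_.data (), buf_.data (), buf_.data () + n);
          return traits_type::to_int_type (*gptr ());
        }
      }

      // Nothing new: give up after the configured number of empty polls,
      // otherwise wait a while and try again.
      if (max_idle_polls_ != 0 && ++idle >= max_idle_polls_)
        return traits_type::eof ();

      std::this_thread::sleep_for (interval_);
    }
  }

private:
  std::string path_;
  std::chrono::milliseconds interval_;
  std::size_t max_idle_polls_;
  std::vector<char> buf_;
  std::streamoff offset_ = 0; // File offset just past the last byte read.
};
```

One would wrap it in a std::istream (`polling_filebuf pb ("log.xml",
std::chrono::milliseconds (200)); std::istream is (&pb);`, where the file
name is just a placeholder) and hand `is` to whatever expects a
std::istream. With `max_idle_polls` left at 0 the read blocks until more
data appears, which is exactly why it would need its own thread.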

If I can avoid it, I'd prefer not to work with separate threads at all 
(the above blocking read solution would need that). I imagined my Qt app 
could be the driver, with a loop to pull in the next (few) top level
tags, and then update the GUI, and so on. This simplifies the whole
setup, and keeps Qt in control.

Qt solves this EOF problem by returning an UnexpectedEOF error, but makes
this recoverable, so we can continue parsing. From what I understand
from the docs and source code, XSD / Xerces don't (yet) support recovery 
from this?
If they do, how is this possible, and is this a way forward?
I don't even see how I can identify the error well - it seems the error 
type number isn't propagated, only the message string, and I'm not going 
to match on that!

Thanks again for your help thus far, it's appreciated.
Cerion

P.S. Do you have plans to make an XML binder for the Qt parsers? ;-)
