[xsd-users] dealing with xml written/read on-the-fly

Wed Oct 14 02:50:46 EDT 2009

Boris, thanks for the quick response...

Boris Kolpackov wrote:
> Hi Cerion,
>
> Cerion Armour-Brown <cerion at kestrel.ws> writes:
>> I've been reading up on XSD - nice toolset!
> Thanks, glad you like it.
>   
>> I'm proposing to use C++/Tree as my memory model, but I don't see a  
>> simple way to parse (and display) the incoming xml chunks as they become  
>> available.    
> I was planning to come up with an example that shows how to do stream-
> oriented, partially in-memory parsing and serialization with C++/Tree
> for the next release. But seeing that you are looking for something
> like this, I went ahead and implemented it. Here is the excerpt from
> the README file:
>
> "This example shows how to perform stream-oriented, partially in-memory 
>  XML processing using the C++/Tree mapping. With the partially in-memory 
>  parsing and serialization only a part of the object model is in memory at
>  any given time. With this approach we can process parts of the document
>  as they become available as well as handle documents that are too large
>  to fit into memory."
>
> I also backported this example to XSD 3.2.0:
>
> http://www.codesynthesis.com/~boris/tmp/xsd-3.2.0-streaming.tar.gz
> http://www.codesynthesis.com/~boris/tmp/xsd-3.2.0-streaming.zip
>
> It replaces the examples/cxx/tree/streaming example in XSD 3.2.0 which
> only shows the serialization part. So to use this example with 3.2.0,
> remove the examples/cxx/tree/streaming/ directory and then copy the
> content of one of the above archives into your XSD distribution
> directory.  
I'll take a proper look as soon as I can, but this does look interesting.
One point isn't quite clear to me, though: I see the current example reads
in chunks and holds only the latest chunk in memory, but I'd need to build
up a complete model, reading it in chunk by chunk. I'm not sure if this is
a simple step to take.
If that is possible, I guess I'd then poll the input file for changes, and
call parser->next() if there's anything new.
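
To make the polling idea concrete, here is a rough sketch of what I have in
mind. It uses only the standard library and none of the XSD API: the
read_appended() and collect_complete() helpers and the "error" tag used with
them are my own placeholders, and the naive string matching only stands in
for a real parser (it ignores attributes and nesting). The point is just the
shape: remember a file offset, pick up whatever was appended since, and move
every complete chunk into a model that keeps growing.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper (not part of XSD): returns whatever has been appended
// to `path` since `offset` and advances `offset` past it. Returns an empty
// string if the file cannot be opened or has not grown.
std::string read_appended(const std::string& path, std::streamoff& offset)
{
  assert(offset >= 0);

  std::ifstream in(path.c_str(), std::ios::binary);
  if (!in)
    return std::string();

  in.seekg(0, std::ios::end);
  const std::streamoff size = in.tellg();
  if (size <= offset)
    return std::string(); // Nothing new yet.

  std::string data(static_cast<std::size_t>(size - offset), '\0');
  in.seekg(offset, std::ios::beg);
  in.read(&data[0], static_cast<std::streamsize>(data.size()));
  data.resize(static_cast<std::size_t>(in.gcount()));
  offset += static_cast<std::streamoff>(data.size());
  return data;
}

// Hypothetical stand-in for the real chunk parser: moves every complete
// <tag>...</tag> fragment from `pending` into `model` and leaves any
// incomplete tail in `pending`. Returns the number of fragments moved.
std::size_t collect_complete(std::string& pending,
                             const std::string& tag,
                             std::vector<std::string>& model)
{
  const std::string open = "<" + tag + ">";
  const std::string close = "</" + tag + ">";
  std::size_t added = 0;

  for (;;)
  {
    const std::size_t b = pending.find(open);
    if (b == std::string::npos)
      break;
    const std::size_t e = pending.find(close, b);
    if (e == std::string::npos)
      break; // The fragment is not complete yet; wait for more data.

    const std::size_t end = e + close.size();
    model.push_back(pending.substr(b, end - b));
    pending.erase(0, end);
    ++added;
  }
  return added;
}
```

The caller would run this from a timer: append read_appended() to a pending
buffer, call collect_complete(), and refresh the display whenever the return
value is non-zero. Because the model vector is never cleared, it ends up
holding the complete document rather than just the latest chunk.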

Background: a valgrind process can be running for hours (or even days!),
hence the need for the user to see what's happening as it happens: I don't
want to wait for the process to end before finding out that something
bad happened after 5 minutes and we should have killed it :-)

>> Would I have to use C++/Parser to build the Tree as complete events are  
>> read in?
>>     
> C++/Parser is inherently stream-oriented. That is, parsing will proceed 
> and the callbacks will be called as data becomes available. C++/Tree is
> inherently in-memory, meaning you won't get to the data until the whole
> document is parsed. But, as the example above shows, C++/Tree can be used
> in a hybrid mode which is often exactly what you want since it doesn't
> require any manual coding (unlike C++/Parser, where you have to implement 
> callbacks) and gives you all the benefits.
>   
>> From my experiments, this looks plausible, but Parser still wants to read 
>> in the whole xml file in one go.
>>     
> Not exactly. I think what you mean is that it doesn't return control until 
> the whole file is parsed. With C++/Parser you would trigger displaying your
> data from one of the callback functions.
>   
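You're absolutely correct... I was approaching this from the Qt point of
view, where a call that doesn't return until the whole file is parsed would
block the GUI event loop. So if I went this route, I'd need to run the
C++/Parser within a separate thread... fair enough.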
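
For the record, this is roughly how I picture the threaded variant. It is
only a sketch under my own assumptions: event_queue, blocking_parse() and
parse_in_background() are names I made up, blocking_parse() is a stand-in for
the real blocking parse call (it just fires a callback per item, the way a
C++/Parser callback would fire as each element completes), and events are
plain strings. It uses std::thread, so it needs a C++11 compiler.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical thread-safe queue that carries parsed events from the
// parser thread to the consuming (e.g. GUI) thread.
class event_queue
{
public:
  void push(std::string event)
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(!closed_); // Pushing after close() is a programming error.
      events_.push_back(std::move(event));
    }
    ready_.notify_one();
  }

  // Signals that no more events will arrive.
  void close()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      closed_ = true;
    }
    ready_.notify_all();
  }

  // Blocks until an event is available or the queue is closed.
  // Returns false once the queue is closed and fully drained.
  bool pop(std::string& event)
  {
    std::unique_lock<std::mutex> lock(mutex_);
    ready_.wait(lock, [this] { return closed_ || !events_.empty(); });
    if (events_.empty())
      return false;
    event = std::move(events_.front());
    events_.pop_front();
    return true;
  }

private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::deque<std::string> events_;
  bool closed_ = false;
};

// Stand-in for the blocking parse: does not return until all input has
// been handled, and invokes `on_event` once per completed item.
void blocking_parse(const std::vector<std::string>& input,
                    const std::function<void(const std::string&)>& on_event)
{
  for (const std::string& item : input)
    on_event(item);
}

// Runs the blocking parse on a worker thread and collects the events on
// the calling thread, in the order they were produced.
std::vector<std::string>
parse_in_background(const std::vector<std::string>& input)
{
  event_queue queue;
  std::thread worker([&] {
    blocking_parse(input, [&](const std::string& e) { queue.push(e); });
    queue.close();
  });

  std::vector<std::string> received;
  std::string event;
  while (queue.pop(event))
    received.push_back(event); // A GUI would update its view here.

  worker.join();
  return received;
}
```

In a real Qt application the consuming side would not sit in a blocking
pop() loop; I'd expect to hand each event to the GUI thread through a queued
signal instead, but the split between a blocking producer and a separate
consumer would be the same.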

>> All advice, pointers, code examples!, etc would be very welcome!
> Let us know if you can come up with something based on the streaming
> example above.
>
> Boris  

Thanks for your help,
Cerion