[xsd-users] Deleting empty elements

delta42 delta42 at gmail.com
Tue Jan 27 13:39:52 EST 2009


> You can serialize the object model to the DOM document first. This
> way you can traverse the document and remove all empty elements.
> After that you can serialize the DOM document to stringstream
> (there is sample code in the FAQ[1] on how to do this).


Thank you very much Boris, you set me on the right track.

I did not find a definition for DOMWriter in my \CodeSynthesis XSD
3.2\include\xercesc folders, so I adapted the sample code to work with
DOMLSSerializer, like this:

void
SerializeDOMtoXMLFormatTarget(xercesc::XMLFormatTarget& target,
                              const xercesc::DOMDocument& doc,
                              const std::string& encoding /*= "UTF-8"*/)
{
    using namespace xercesc;
    namespace xml = xsd::cxx::xml;
    namespace tree = xsd::cxx::tree;

    const XMLCh ls_id [] = {chLatin_L, chLatin_S, chNull};

    // Get an implementation of the Load-Store (LS) interface.
    //
    DOMImplementation* impl(
        DOMImplementationRegistry::getDOMImplementation(ls_id));

    // Create a DOMLSSerializer.
    //
    xml::dom::auto_ptr<DOMLSSerializer> writer(impl->createLSSerializer ());

    DOMConfiguration* conf(writer->getDomConfig ());

    // Set error handler.
    //
    tree::error_handler<char> eh;
    xml::dom::bits::error_handler_proxy<char> ehp (eh);
    conf->setParameter(XMLUni::fgDOMErrorHandler, &ehp);

    // Set some nice features if the serializer supports them.
    //
    if (conf->canSetParameter(XMLUni::fgDOMWRTDiscardDefaultContent, true))
        conf->setParameter(XMLUni::fgDOMWRTDiscardDefaultContent, true);
    if (conf->canSetParameter(XMLUni::fgDOMWRTFormatPrettyPrint, true))
        conf->setParameter (XMLUni::fgDOMWRTFormatPrettyPrint, true);
    if (conf->canSetParameter(XMLUni::fgDOMXMLDeclaration, true))
        conf->setParameter(XMLUni::fgDOMXMLDeclaration, true);

    xml::dom::auto_ptr<DOMLSOutput> out(impl->createLSOutput());
    out->setEncoding(xml::string (encoding).c_str());
    out->setByteStream(&target);

    // Failures are reported through the error handler checked below.
    writer->write(&doc, out.get());

    eh.throw_if_failed<tree::serialization<char> > ();
}

and this works very well so far. I don't really care whether the final
output is a MemBufFormatTarget or a stringstream; I just need access to the
raw bytes (as a void*).

Thanks also for remove_empty_elements, which worked perfectly. The only
change I made was to have remove_empty_elements return the number of
elements it removed, and I added another function:

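// Parameter type is assumed to match that of remove_empty_elements.
void remove_all_empty_elements (xercesc::DOMElement* e)
{
    // Repeat until a pass removes nothing.
    while (remove_empty_elements (e) != 0)
        ;
}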

This way I can keep iterating to remove the elements that only became empty
as a result of the previous removal pass.

For example, my current example looks like this (after renaming my tags):

    <a>
      <b>
        <c>
          <d>
            <e/>
          </d>
        </c>
        <f/>
        <g>
          <h/>
          <i/>
        </g>
      </b>
    </a>

After the first pass I have:

    <a>
      <b>
        <c>
          <d>
          </d>
        </c>
        <g>
        </g>
      </b>
    </a>

then:

    <a>
      <b>
        <c>
        </c>
      </b>
    </a>

and so on, until the whole node disappears after 3 more passes.

I'm not sure whether there is a faster or better way to do this, but it does
not take any noticeable time on my PC, so that is good enough. Validation is
another story: it takes about 10 seconds (E6600 processor, WinXP). I suspect
most of that time goes into parsing my huge number of schemas, so when I get
a chance I may cache the schemas in advance, or even implement the binary
grammar object that I read about on your forum lately.
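
On the question of a faster way to prune: the pass-by-pass behaviour shown
earlier can be reproduced on a toy tree, and a post-order variant (prune the
children first, then test the element) would likely finish in a single
traversal. The sketch below is an assumption-laden illustration only: Node
is a hypothetical stand-in, not the Xerces-C++ DOM, "empty" is taken to mean
no children and no text, and all function names are made up for the example.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a DOM element (hypothetical, not the Xerces-C++ API).
struct Node
{
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node> > children;
};

// Assumed emptiness rule: no child elements and no text.
bool is_empty (const Node& n)
{
    return n.children.empty () && n.text.empty ();
}

// One pass in the style described above: remove children that are empty
// right now, recurse into the others. Returns the number removed.
std::size_t remove_empty_pass (Node& parent)
{
    std::size_t removed = 0;
    std::vector<std::unique_ptr<Node> >& c = parent.children;
    for (auto i = c.begin (); i != c.end (); )
    {
        if (is_empty (**i))
        {
            i = c.erase (i);
            ++removed;
        }
        else
        {
            removed += remove_empty_pass (**i);
            ++i;
        }
    }
    return removed;
}

// Repeat passes until nothing is removed; returns how many passes removed
// at least one element.
std::size_t count_passes (Node& root)
{
    std::size_t passes = 0;
    while (remove_empty_pass (root) != 0)
        ++passes;
    return passes;
}

// Post-order variant: prune each child's subtree first, then test the
// child, so elements that become empty are caught in the same traversal.
std::size_t remove_empty_post_order (Node& parent)
{
    std::size_t removed = 0;
    std::vector<std::unique_ptr<Node> >& c = parent.children;
    for (auto i = c.begin (); i != c.end (); )
    {
        removed += remove_empty_post_order (**i);
        if (is_empty (**i))
        {
            i = c.erase (i);
            ++removed;
        }
        else
            ++i;
    }
    return removed;
}

// Append a named child and return a reference to it.
Node& add (Node& parent, const std::string& name)
{
    parent.children.push_back (std::unique_ptr<Node> (new Node ()));
    parent.children.back ()->name = name;
    return *parent.children.back ();
}

// Build the <a>...</a> example under a hypothetical enclosing element.
Node make_example ()
{
    Node root;
    root.name = "root";
    Node& a = add (root, "a");
    Node& b = add (a, "b");
    Node& c = add (b, "c");
    Node& d = add (c, "d");
    add (d, "e");
    add (b, "f");
    Node& g = add (b, "g");
    add (g, "h");
    add (g, "i");
    return root;
}
```

With the example tree, count_passes needs five removing passes (two plus
the three more mentioned above), while remove_empty_post_order removes the
same nine elements in one call. With a real DOM the emptiness test would
also have to decide how to treat attributes and whitespace-only text nodes.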

Thank you once again Boris for your dedication to this wonderful project,
delta42.


