[xsd-users] Using XSD to validate and process documents without namespaces specified in top level elements

Karl Mutch karlmutchlists at gmail.com
Thu Aug 21 00:58:53 EDT 2008


Hi,

I have an issue where I am trying to parse documents that come from a
third party. These have none of the normal namespace attributes, and I would
like to parse the input sources in such a way that they can be validated
using my own xsd files.

I have tried a large number of approaches and am now running, or rather
failing, with something similar to the following:


namespace
{
    std::string xml_get_next_lane_response_schema_data(
"<?xml version=\"1.0\" encoding=\"utf-8\"?>\n\
<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" \
xmlns:pp=\"http://www.enterprise.com/GetNextLaneResponse\" \
xmlns=\"http://www.enterprise.com/GetNextLaneResponse\" \
targetNamespace=\"http://www.enterprise.com/GetNextLaneResponse\" \
elementFormDefault=\"unqualified\" attributeFormDefault=\"unqualified\">\n\
    <xs:element name=\"request\">\n\
        <xs:complexType>\n\
            <xs:sequence maxOccurs=\"1\">\n\
                <xs:element name=\"message\" minOccurs=\"1\" maxOccurs=\"1\">\n\
                    <xs:complexType>\n\
                        <xs:simpleContent>\n\
                            <xs:extension base=\"xs:string\">\n\
                                <xs:attribute name=\"status\" type=\"xs:string\" use=\"required\" />\n\
                            </xs:extension>\n\
                        </xs:simpleContent>\n\
                    </xs:complexType>\n\
                </xs:element>\n\
                <xs:element name=\"data\" minOccurs=\"0\" maxOccurs=\"1\">\n\
                    <xs:complexType>\n\
                        <xs:sequence>\n\
                            <xs:element name=\"Lane\" type=\"xs:unsignedInt\" minOccurs=\"1\" maxOccurs=\"1\" />\n\
                        </xs:sequence>\n\
                    </xs:complexType>\n\
                </xs:element>\n\
            </xs:sequence>\n\
            <xs:attribute name=\"task\" type=\"xs:string\" use=\"required\" />\n\
        </xs:complexType>\n\
    </xs:element>\n\
</xs:schema>\n");

    xsd::cxx::xml::string inputGrammar(xml_get_next_lane_response_schema_data);

    bool
    Initialize()
    {
        // For performance reasons, we would like to initialize/terminate
        // Xerces-C++ ourselves once instead of letting API functions do
        // it potentially continuously during processing.
        //
        xercesc::XMLPlatformUtils::Initialize ();

        return(true);
    }

    /* USED */
    bool fInitialized = Initialize();

    using namespace enterprise;

    class LocalResolver : public xercesc::DOMEntityResolver
    {
    public:
        xercesc::DOMInputSource *resolveEntity(const XMLCh* const    publicId,
                                      const XMLCh* const    systemId,
                                      const XMLCh* const    baseURI)
        {
            return(new xercesc::Wrapper4InputSource (
                new xercesc::MemBufInputSource(
                    reinterpret_cast<unsigned char const *>(
                        xml_get_next_lane_response_schema_data.c_str ()),
                    xml_get_next_lane_response_schema_data.size (),
                    "GetNextLaneResponse.xsd")));
        }

    };

    // Throws exceptions that are expected to be handled by callers !
    std::auto_ptr<enterprise::subsystem::GetNextLaneResponse::request>
    ParseDocument(std::istream &inputStream)
    {
        using namespace xercesc;
        namespace xml = xsd::cxx::xml;
        namespace tree = xsd::cxx::tree;

        const XMLCh ls_id [] = {chLatin_L, chLatin_S, chNull};

        // Get an implementation of the Load-Store (LS) interface.
        //
        DOMImplementation* impl (
          DOMImplementationRegistry::getDOMImplementation (ls_id));

        // Create a DOMBuilder.
        //
        xml::dom::auto_ptr<DOMBuilder> parser (
          impl->createDOMBuilder(DOMImplementationLS::MODE_SYNCHRONOUS, 0));

        // Discard comment nodes in the document.
        //
        parser->setFeature (XMLUni::fgDOMComments, false);

        // Enable datatype normalization.
        //
        parser->setFeature (XMLUni::fgDOMDatatypeNormalization, true);

        // Do not create EntityReference nodes in the DOM tree. No
        // EntityReference nodes will be created, only the nodes
        // corresponding to their fully expanded substitution text
        // will be created.
        //
        parser->setFeature (XMLUni::fgDOMEntities, false);

        // Perform namespace processing.
        //
        parser->setFeature (XMLUni::fgDOMNamespaces, true);

        // Do not include ignorable whitespace in the DOM tree.
        //
        parser->setFeature (XMLUni::fgDOMWhitespaceInElementContent, false);

        // Enable validation.
        //
        parser->setFeature (XMLUni::fgDOMValidation, true);
        parser->setFeature (XMLUni::fgXercesSchema, true);
        parser->setFeature (XMLUni::fgXercesSchemaFullChecking, false);

        parser->setProperty (
            XMLUni::fgXercesSchemaExternalNoNameSpaceSchemaLocation,
            const_cast<void*> (
                static_cast<const void*> (
                xml::string ("GetNextLaneResponse.xsd").c_str ())));

        // The value is a "namespace location" pair separated by a space.
        parser->setProperty (XMLUni::fgXercesSchemaExternalSchemaLocation,
            const_cast<void*> (
                static_cast<const void*> (
                xml::string (
                    "http://www.enterprise.com/GetNextLaneResponse GetNextLaneResponse.xsd").c_str ())));

        // Initialize the schema cache.
        //    virtual Grammar* loadGrammar(const DOMInputSource& source,
        //        const short grammarType, const bool toCache = false) = 0;

        //std::istringstream input_grammar(xml_get_next_lane_response_schema_data);
        //xml::sax::std_input_source grammar_wrapper(input_grammar);

        //xml::string inputGrammar(xml_get_next_lane_response_schema_data);
        xercesc::MemBufInputSource grammar_wrapper (
            reinterpret_cast<unsigned char const *>(
                xml_get_next_lane_response_schema_data.c_str ()),
            xml_get_next_lane_response_schema_data.size (),
            "GetNextLaneResponse.xsd");
        xercesc::Wrapper4InputSource grammar_input_wrapper (
            &grammar_wrapper, false);

        parser->loadGrammar (grammar_input_wrapper,
                             Grammar::SchemaGrammarType, true);
        parser->setFeature (XMLUni::fgXercesUseCachedGrammarInParse, true);

        // We will release the DOM document ourselves.
        //
        parser->setFeature (XMLUni::fgXercesUserAdoptsDOMDocument, true);

        // Set error handler.
        //
        tree::error_handler<char> eh;
        xml::dom::bits::error_handler_proxy<char> ehp (eh);
        parser->setErrorHandler (&ehp);

        // Set the entity resolver
        LocalResolver localResolver;
        parser->setEntityResolver(&localResolver);
        // Wrap the standard input stream.
        //
        xml::sax::std_input_source isrc(inputStream,
                                        "GetNextLaneResponse.xsd");
        Wrapper4InputSource wrap (&isrc, false);

        wrap.setSystemId(xml::transcode_to_xmlch("GetNextLaneResponse.xsd"));

        // Parse XML to DOM.
        //
        xml::dom::auto_ptr<xercesc_2_8::DOMDocument> doc (
            parser->parse (wrap));
        eh.throw_if_failed<tree::parsing<char> > ();

        xml_schema::properties properties;
        properties.schema_location(
            "http://www.enterprise.com/GetNextLaneResponse",
            "GetNextLaneResponse.xsd");
        properties.no_namespace_schema_location("GetNextLaneResponse.xsd");

        // Parse DOM to the object model.
        //
        return(std::auto_ptr<enterprise::subsystem::GetNextLaneResponse::request>
            (enterprise::subsystem::GetNextLaneResponse::request_ (
                *doc,
                xml_schema::flags::keep_dom | xml_schema::flags::own_dom,
                properties)));
    }   // end of ... ParseDocument(std::istream &inputStream)
}

More code follows, and then I push the following document through the parser:

<?xml version="1.0" encoding="utf-8"?>
<request task="Monitor">
    <message status="12">A Message</message>
</request>

It bails with the following error:

"Schema in GetNextLaneResponse.xsd has a different target namespace from the
one specified in the instance document."

Obviously this is because I cannot force the target namespace onto the
instance document. So I turned off namespace processing using

parser->setFeature (XMLUni::fgDOMNamespaces, false);

And get an "Unknown element" error for the request tag.
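
To make the mismatch concrete, here is a rough sketch that uses only the
standard library (no Xerces). The helper names root_start_tag,
attribute_value and namespaces_match are made up for illustration, and the
parsing is deliberately naive: it assumes double-quoted attributes, no '>'
inside attribute values, and an unprefixed root element in the instance. It
compares the schema's targetNamespace with the default namespace declared on
the instance's root element, which appears to be the comparison the first
error message is about:

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: return the first start tag that is neither an XML
// declaration / processing instruction ("<?") nor a comment or DOCTYPE ("<!").
std::string root_start_tag(const std::string& xml)
{
    std::string::size_type pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos)
    {
        if (pos + 1 < xml.size() && xml[pos + 1] != '?' && xml[pos + 1] != '!')
        {
            const std::string::size_type end = xml.find('>', pos);
            return end == std::string::npos
                ? std::string()
                : xml.substr(pos, end - pos + 1);
        }
        ++pos;
    }
    return std::string();
}

// Hypothetical helper: value of the double-quoted attribute `name` in `tag`,
// or an empty string if the attribute is absent.
std::string attribute_value(const std::string& tag, const std::string& name)
{
    assert(!name.empty());
    const std::string key = name + "=\"";
    std::string::size_type pos = 0;
    while ((pos = tag.find(key, pos)) != std::string::npos)
    {
        // Accept only a whole attribute name, i.e. one preceded by whitespace,
        // so that "xmlns" does not match inside some longer name.
        if (pos > 0 && std::isspace(static_cast<unsigned char>(tag[pos - 1])))
        {
            const std::string::size_type begin = pos + key.size();
            const std::string::size_type end = tag.find('"', begin);
            return end == std::string::npos
                ? std::string()
                : tag.substr(begin, end - begin);
        }
        pos += key.size();
    }
    return std::string();
}

// True when the schema's targetNamespace equals the instance root's default
// namespace; an absent attribute on either side counts as "no namespace".
bool namespaces_match(const std::string& schema, const std::string& instance)
{
    return attribute_value(root_start_tag(schema), "targetNamespace")
        == attribute_value(root_start_tag(instance), "xmlns");
}
```

Calling namespaces_match with the schema string and the document shown above
returns false: the schema declares
http://www.enterprise.com/GetNextLaneResponse, while <request> declares no
namespace at all. A schema with no targetNamespace attribute would compare
equal to it. Whether such a variant, loaded through the no-namespace schema
location, is acceptable for this use case is an assumption and not something
established here.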

If I know for certain which schema applies, is there a way to make this work
and resolve my elements without mangling the input document?

Thanks
karl

P.S. I have read the following and don't seem to get any joy from them.

http://www.codesynthesis.com/pipermail/xsd-users/2007-February/000796.html
http://www.codesynthesis.com/pipermail/xsd-users/2006-September/000535.html

My intent would be to use the code from
http://wiki.codesynthesis.com/Tree/FAQ#How_do_I_specify_a_schema_location_other_than_in_an_XML_document.3F
to identify the correct schema in time.


