Running XPath on a C++/Tree object model

One interesting feature of the C++/Tree mapping in XSD is the ability to maintain an association between C++ object model nodes and corresponding DOM nodes. Consider the following XML document as an example:

<p:directory xmlns:p="http://www.example.com/people">
  <person>
    <first-name>John</first-name>
    <last-name>Doe</last-name>
    <gender>male</gender>
    <age>32</age>
  </person>
 
  <person>
    <first-name>Jane</first-name>
    <last-name>Doe</last-name>
    <gender>female</gender>
    <age>28</age>
  </person>
</p:directory>

Provided we requested the DOM association during parsing (by passing the xml_schema::flags::keep_dom flag to the parsing function), given a person object we can obtain the corresponding DOMElement node. We can also go the other way: given a DOM node from a DOM document associated with a C++/Tree object model, we can obtain the corresponding object model node.
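To make the two directions concrete, here is a minimal, library-free sketch of how such an association can be modeled. It is a toy and not how XSD or Xerces-C++ implement it: toy_dom_node, toy_person, toy_tree_node_key, and person_from_node are all hypothetical names. The only idea borrowed from the real API is that the object model node keeps a pointer to its DOM node, while the DOM node keeps a back pointer stored as opaque user data under a well-known key.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a DOM node: a name plus a table of opaque
// user-data pointers indexed by key (similar in spirit to the
// DOM user data mechanism, but not the Xerces-C++ API).
//
struct toy_dom_node
{
  std::string name;
  std::map<std::string, void*> user_data;

  void set_user_data (const std::string& key, void* data)
  {
    user_data[key] = data;
  }

  void* get_user_data (const std::string& key) const
  {
    std::map<std::string, void*>::const_iterator i (user_data.find (key));
    return i != user_data.end () ? i->second : 0;
  }
};

// Hypothetical key under which the back pointer is stored.
//
const char toy_tree_node_key[] = "toy-tree-node";

// Toy stand-in for an object model node.
//
struct toy_person
{
  std::string first_name;
  toy_dom_node* node; // Object model -> DOM direction.

  toy_person (const std::string& fn, toy_dom_node& n)
      : first_name (fn), node (&n)
  {
    // DOM -> object model direction.
    //
    n.set_user_data (toy_tree_node_key, this);
  }
};

// "Move up" from a DOM node to its object model node. The node
// must have been associated with a toy_person beforehand.
//
toy_person* person_from_node (const toy_dom_node& n)
{
  void* p (n.get_user_data (toy_tree_node_key));
  assert (p != 0);
  return static_cast<toy_person*> (p);
}
```

Given a toy_dom_node n and a toy_person p ("John", n), p.node points at n and person_from_node (n) returns the address of p. Note that the toy object registers its own address, so it must not be copied or moved after construction.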

One technique that is made possible thanks to the DOM association is the use of XPath queries to locate object model nodes. This is especially useful if you have a deeply nested document and you only need to access a small part of it buried deep inside.

The idea is to run an XPath query on the underlying DOM document, obtain the result as a collection of DOM nodes, and then “move up” from these DOM nodes to the object model nodes. While the DOM implementation provided by Xerces-C++ does not support XPath, there are complementary libraries, such as XQilla, that provide this functionality. The following code fragment shows how to locate all the people from the above XML file who are older than 30. It uses XQilla and the DOM XPath API from Xerces-C++ 2.8.0:

directory& d = ...
 
// Obtain the root element and document corresponding
// to the directory object.
//
DOMElement* root (static_cast<DOMElement*> (d._node ()));
DOMDocument* doc (root->getOwnerDocument ());
 
// Obtain namespace resolver.
//
dom::auto_ptr<XQillaNSResolver> resolver (
  (XQillaNSResolver*)doc->createNSResolver (root));
 
// Set the namespace prefix for the people namespace that
// we can use reliably in XPath expressions regardless of
// what is used in XML documents.
//
resolver->addNamespaceBinding (
  xml::string ("p").c_str (),
  xml::string ("http://www.example.com/people").c_str ());
 
// Create XPath expression.
//
dom::auto_ptr<const XQillaExpression> expr (
  static_cast<const XQillaExpression*> (
    doc->createExpression (
      xml::string ("p:directory/person[age > 30]").c_str (),
      resolver.get ())));
 
// Execute the query.
//
dom::auto_ptr<XPath2Result> r (
  static_cast<XPath2Result*> (
    expr->evaluate (
      doc, XPath2Result::ITERATOR_RESULT, 0)));
 
// Iterate over the result.
//
while (r->iterateNext ())
{
  const DOMNode* n (r->asNode ());
 
  // Obtain the object model node corresponding to
  // this DOM node.
  //
  person* p (
    static_cast<person*> (
      n->getUserData (dom::tree_node_key)));
 
  // Print the data using the object model.
  //
  cout << endl
       << "First  : " << p->first_name () << endl
       << "Last   : " << p->last_name () << endl
       << "Gender : " << p->gender () << endl
       << "Age    : " << p->age () << endl;
}

As you can see, the code is littered with casts to XQilla-specific types such as XQillaNSResolver, XQillaExpression, and XPath2Result. This is necessary because the DOM interface in the Xerces-C++ 2 series only supports the XPath 1.0 query model, which is not sufficient for the XPath 2.0 implemented by XQilla.

To make the integration of XQilla with Xerces-C++ cleaner, the Xerces-C++ and XQilla developers came up with an extended DOM XPath interface that accommodates both the XPath 1.0 and 2.0 query models. On the Xerces-C++ side this interface was first made public in version 3.0.0. Soon after, XQilla 2.2.0 was released with an implementation of the new interface. The above code fragment rewritten to use the new interface is shown below:

directory& d = ...
 
// Obtain the root element and document corresponding
// to the directory object.
//
DOMElement* root (static_cast<DOMElement*> (d._node ()));
DOMDocument* doc (root->getOwnerDocument ());
 
// Obtain namespace resolver.
//
dom::auto_ptr<DOMXPathNSResolver> resolver (
  doc->createNSResolver (root));
 
// Set the namespace prefix for the people namespace that
// we can use reliably in XPath expressions regardless of
// what is used in XML documents.
//
resolver->addNamespaceBinding (
  xml::string ("p").c_str (),
  xml::string ("http://www.example.com/people").c_str ());
 
// Create XPath expression.
//
dom::auto_ptr<DOMXPathExpression> expr (
  doc->createExpression (
    xml::string ("p:directory/person[age > 30]").c_str (),
    resolver.get ()));
 
// Execute the query.
//
dom::auto_ptr<DOMXPathResult> r (
  expr->evaluate (
    doc, DOMXPathResult::ITERATOR_RESULT_TYPE, 0));
 
// Iterate over the result.
//
while (r->iterateNext ())
{
  DOMNode* n (r->getNodeValue ());
 
  // Obtain the object model node corresponding to
  // this DOM node.
  //
  person* p (
    static_cast<person*> (
      n->getUserData (dom::tree_node_key)));
 
  // Print the data using the object model.
  //
  cout << endl
       << "First  : " << p->first_name () << endl
       << "Last   : " << p->last_name () << endl
       << "Gender : " << p->gender () << endl
       << "Age    : " << p->age () << endl;
}
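
Both fragments select the same nodes: for the sample document, the predicate person[age > 30] matches John Doe (32) and skips Jane Doe (28). For comparison, the following sketch applies the same filter by walking a flat container by hand, which is the kind of traversal code that an XPath query saves you from writing when the data is buried deep inside a nested document. person_record, sample_directory, and older_than are hypothetical stand-ins and not XSD-generated code.

```cpp
#include <cassert> // assert, for checks like the one in the note below.
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-struct stand-in for the generated person class.
//
struct person_record
{
  std::string first_name;
  std::string last_name;
  std::string gender;
  unsigned short age;
};

// Build the two records from the sample XML document.
//
std::vector<person_record> sample_directory ()
{
  std::vector<person_record> d;
  person_record john = {"John", "Doe", "male", 32};
  person_record jane = {"Jane", "Doe", "female", 28};
  d.push_back (john);
  d.push_back (jane);
  return d;
}

// Hand-written equivalent of the person[age > limit] predicate.
// The returned pointers refer to elements of 'people', so the
// container must outlive the result.
//
std::vector<const person_record*>
older_than (const std::vector<person_record>& people, unsigned short limit)
{
  std::vector<const person_record*> r;

  for (std::size_t i (0); i != people.size (); ++i)
  {
    if (people[i].age > limit)
      r.push_back (&people[i]);
  }

  return r;
}
```

With d initialized from sample_directory (), a check such as assert (older_than (d, 30).size () == 1) holds, and the single result is the John Doe record.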
