December 6th, 2006
While testing XSD-generated code on IBM XL C++ 7.0, I discovered an interesting difference between expressing the same semantics with default arguments and with function overloading. Consider the following code snippet:
template <typename X>
struct sequence
{
  void resize (size_t, X const& x = X ());
};
What happens when the template argument for X does not have a default constructor? The majority of C++ compilers think this is fine as long as you don’t call resize with the default value for its second argument. But IBM XL C++ 7.0 does not. While I agree that we only need the default constructor at the function’s call site, the default argument is still a part of the interface. If we were to write something like this:
template <typename X>
struct sequence
{
  void f (typename X::foo);
};
And the template argument for X didn’t have a nested type named foo, then it would be an error even though we might never call f. Fortunately, it is fairly easy to resolve this issue by rewriting the original example using overloading instead of the default argument:
template <typename X>
struct sequence
{
  void resize (size_t);
  void resize (size_t, X const&);
};
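To make the overloaded version concrete, here is a minimal, self-contained sketch. The element type no_default, the std::vector storage, and the size and operator[] accessors are all assumptions added for illustration; only the two resize declarations come from the example above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that has no default constructor.
struct no_default
{
  explicit no_default (int v): value (v) {}
  int value;
};

template <typename X>
struct sequence
{
  // Needs X () only if this overload is actually called, because
  // the body of a member function of a class template is not
  // instantiated until the function is used.
  void resize (std::size_t n)
  {
    v_.resize (n, X ());
  }

  // Nothing in this declaration or body mentions X (), so it
  // works for any copyable X.
  void resize (std::size_t n, X const& x)
  {
    v_.resize (n, x);
  }

  // Accessors added for illustration only.
  std::size_t size () const {return v_.size ();}
  X const& operator[] (std::size_t i) const {return v_[i];}

private:
  std::vector<X> v_; // Assumed storage; the original does not show any.
};
```

With this layout, sequence<no_default> can be instantiated and resized with an explicit value, for example s.resize (3, no_default (7)), and the single-argument overload is simply never used for such types.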
November 28th, 2006
If you are using Xerces-C++ DOM, then you might want to know about a few functions that you probably shouldn’t use or, at least, should think twice before using. These are getChildNodes and getTextContent.
There is nothing wrong with getChildNodes per se. It returns a DOMNodeList, which has the DOMNode* item (size_t index) member function. The problem is actually with the item function, which does its job in O(n) instead of O(1) as one would expect. This makes an index-based loop over all the children quadratic in their number. As a result, you would be better off rewriting your DOMNodeList-based iterations like this:
for (DOMNode* n (e.getFirstChild ());
     n != 0;
     n = n->getNextSibling ())
{
  ...
}
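To see how much difference this makes, here is a toy model of the two iteration styles. It is not Xerces-C++ code: the node struct, the item function, and the hop counting are hypothetical stand-ins that only assume what is stated above, namely that indexed access walks the sibling chain from the first child.

```cpp
#include <cstddef>

// Toy stand-in for a DOM node: children form a singly linked list.
struct node
{
  node* first_child;
  node* next_sibling;
  node (): first_child (0), next_sibling (0) {}
};

// Links count nodes from storage as the children of parent.
void link_children (node& parent, node* storage, std::size_t count)
{
  parent.first_child = count != 0 ? storage : 0;
  for (std::size_t i (0); i + 1 < count; ++i)
    storage[i].next_sibling = &storage[i + 1];
}

// Indexed access modelled as a walk from the first child. Each
// sibling link that is followed adds one to hops.
node* item (node const& parent, std::size_t index, std::size_t& hops)
{
  node* n (parent.first_child);
  for (; n != 0 && index != 0; --index)
  {
    n = n->next_sibling;
    ++hops;
  }
  return n;
}

// Visits every child by index; returns the number of links followed.
std::size_t visit_by_index (node const& parent, std::size_t count)
{
  std::size_t hops (0);
  for (std::size_t i (0); i != count; ++i)
    item (parent, i, hops);
  return hops;
}

// Visits every child via the sibling chain; one step per child.
std::size_t visit_by_sibling (node const& parent)
{
  std::size_t hops (0);
  for (node* n (parent.first_child); n != 0; n = n->next_sibling)
    ++hops;
  return hops;
}
```

In this model, visiting n children by index follows n(n-1)/2 links in total, while the sibling-based loop takes n steps.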
The problem with getTextContent lies in the memory management area. This function goes over child nodes accumulating text in a buffer, which it returns to you at the end. The important part to know is that this buffer is allocated on the document heap and will only be freed when you destroy the document. Imagine an application that loads a DOM document at the beginning and then performs multiple queries (which involve calling getTextContent) on this single document: every call allocates another buffer, so memory use keeps growing for as long as the document is alive.
Here is my implementation of text_content, which does its job without leaking memory. Note that its semantics differ a bit from the standard getTextContent. In particular, it only looks at the child text nodes, and it throws if it sees a nested DOMElement (no mixed content):
#include <string>

#include <xercesc/dom/DOMNode.hpp>
#include <xercesc/dom/DOMText.hpp>
#include <xercesc/dom/DOMElement.hpp>
#include <xercesc/util/XMLString.hpp>

// Thrown if the element contains nested elements.
struct mixed_content {};

std::string
text_content (const xercesc::DOMElement& e)
{
  std::string r;

  using xercesc::DOMNode;
  using xercesc::DOMText;
  using xercesc::XMLString;

  // Walk the sibling chain instead of using DOMNodeList::item.
  for (DOMNode* n (e.getFirstChild ());
       n != 0;
       n = n->getNextSibling ())
  {
    switch (n->getNodeType ())
    {
    case DOMNode::TEXT_NODE:
    case DOMNode::CDATA_SECTION_NODE:
      {
        DOMText* t (static_cast<DOMText*> (n));

        // The transcoded buffer is ours, so release it after copying.
        char* str (XMLString::transcode (t->getData ()));
        r += str;
        XMLString::release (&str);
        break;
      }
    case DOMNode::ELEMENT_NODE:
      {
        throw mixed_content ();
      }
    }
  }

  return r;
}