[xsd-users] Poor performance of Unicode conversion

Boris Kolpackov boris at codesynthesis.com
Tue Oct 2 06:13:43 EDT 2007


Hi Ray,

Boris Kolpackov <boris at codesynthesis.com> writes:

> Ray Lischner <rlischner at proteus-technologies.com> writes:
>
> > DOM-to-object model stage. Some of our schemas are predominantly
> > strings.
>
> Ok, good. I am going to profile this and see if anything can be
> optimized. I will get back to you with the results.

I profiled the DOM-to-object model stage in XSD 3.0.0. The input was an
XML instance made of the following fragment repeated to about 300KB:

  <dummyBindingTest>
    <aIntElem>42</aIntElem>
    <aDoubleElem>42345.4232</aDoubleElem>
    <aNameElem>aName123_45</aNameElem>
    <aTimeElem>2005-10-28T09:00:13.323257</aTimeElem>
    <optString>bla bla bla3257</optString>
    <choice1>1st choice</choice1>
    <enumElem>ENUM1</enumElem>
  </dummyBindingTest>


The test executable performs 1000 DOM-to-object parse iterations and is
statically linked, so Xerces-C++, libstdc++, etc., are all counted in the
results. Here is the top part of the oprofile output:

CPU: AMD64 processors, speed 1793.09 MHz (estimated)

samples  %        symbol name
35869    13.4132  xsd::cxx::xml::bits::char_transcoder<char>::to(unsigned short const*, unsigned long)
19841     7.4196  _int_free
17425     6.5161  xsd::cxx::xml::qualified_name<char> xsd::cxx::xml::dom::name<char>(xercesc_2_8::DOMElement const&)
16646     6.2248  std::basic_string<char, std::char_traits<char>, std::allocator<char> > xsd::cxx::tree::text_content<char>(xercesc_2_8::DOMElement const&)
13952     5.2174  _int_malloc
13515     5.0539  std::string::compare(char const*) const
12593     4.7092  strlen
9303      3.4789  free
7600      2.8420  malloc
7046      2.6349  DummyBindingTest::parse(xsd::cxx::xml::dom::parser<char>&, xsd::cxx::tree::flags)
6247      2.3361  malloc_consolidate
6146      2.2983  memset
6026      2.2534  std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > >::_M_extract_float(std::istreambuf_iterator<char, std::char_traits<char> >, std::istreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, std::_Ios_Iostate&, std::string&) const
5127      1.9172  xercesc_2_8::DOMElementImpl::getNodeType() const
5005      1.8716  __gnu_cxx::__exchange_and_add(int volatile*, int)
4752      1.7770  std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long)
4517      1.6891  ____strtod_l_internal
4466      1.6701  memcpy
4335      1.6211  std::string::reserve(unsigned long)
3325      1.2434  operator new(unsigned long)
2939      1.0990  memchr
2930      1.0957  xercesc_2_8::DOMTextImpl::getNodeType() const
2890      1.0807  xsd::cxx::tree::token<char, xsd::cxx::tree::normalized_string<char, xsd::cxx::tree::string<char, xsd::cxx::tree::simple_type<xsd::cxx::tree::_type> > > >::token(xercesc_2_8::DOMElement const&, xsd::cxx::tree::flags, xsd::cxx::tree::_type*)
2865      1.0714  std::string::append(unsigned long, char)
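
The top entry, char_transcoder<char>::to, turns the UTF-16 text that
Xerces-C++ hands out into a char string. For readers who have not written
such a routine, here is a minimal sketch of a UTF-16 to UTF-8 conversion
that sizes the result first and allocates once. It is not XSD's code: the
names decode and utf16_to_utf8 are made up for illustration, and it uses
char16_t where the profiled signature uses unsigned short.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helpers for illustration; not XSD's char_transcoder.

// Decode one code point from s[i..n) and advance i past it.
inline char32_t decode (const char16_t* s, std::size_t n, std::size_t& i)
{
  char32_t c = s[i++];

  if (c >= 0xD800 && c <= 0xDBFF) // High surrogate, needs a low one.
  {
    if (i == n || s[i] < 0xDC00 || s[i] > 0xDFFF)
      throw std::invalid_argument ("unpaired high surrogate");

    c = 0x10000 + ((c - 0xD800) << 10) + (s[i++] - 0xDC00);
  }
  else if (c >= 0xDC00 && c <= 0xDFFF)
    throw std::invalid_argument ("unpaired low surrogate");

  return c;
}

inline std::string utf16_to_utf8 (const char16_t* s, std::size_t n)
{
  // Pass 1: compute the exact UTF-8 length.
  std::size_t bytes = 0;
  for (std::size_t i = 0; i < n; )
  {
    char32_t c = decode (s, n, i);
    bytes += c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
  }

  // Pass 2: reserve once, then encode.
  std::string r;
  r.reserve (bytes);
  for (std::size_t i = 0; i < n; )
  {
    char32_t c = decode (s, n, i);

    if (c < 0x80)
      r += static_cast<char> (c);
    else if (c < 0x800)
    {
      r += static_cast<char> (0xC0 | (c >> 6));
      r += static_cast<char> (0x80 | (c & 0x3F));
    }
    else if (c < 0x10000)
    {
      r += static_cast<char> (0xE0 | (c >> 12));
      r += static_cast<char> (0x80 | ((c >> 6) & 0x3F));
      r += static_cast<char> (0x80 | (c & 0x3F));
    }
    else
    {
      r += static_cast<char> (0xF0 | (c >> 18));
      r += static_cast<char> (0x80 | ((c >> 12) & 0x3F));
      r += static_cast<char> (0x80 | ((c >> 6) & 0x3F));
      r += static_cast<char> (0x80 | (c & 0x3F));
    }
  }

  return r;
}
```

Scanning the input twice costs extra reads, but it avoids growing the
string repeatedly. Whether that trade pays off for short element values
like the ones in this test is something only a profile can tell.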

The test parses the above 300KB document in about 15ms. As the profile
shows, the UTF-16 to UTF-8 conversion accounts for only about 13% of the
time. So even if we eliminated it completely, the parse time could not
drop by more than about 13%.

I optimized the top three XSD functions in the profile
(char_transcoder::to, xsd::cxx::xml::dom::name, and
xsd::cxx::tree::text_content) and managed to squeeze out 3ms, a 20%
reduction, so the time is now about 12ms per document. Here is the
oprofile output after the optimizations:

samples  %        symbol name
27195    11.7552  xsd::cxx::xml::bits::char_transcoder<char>::to(unsigned short const*, unsigned long)
16805     7.2640  xsd::cxx::xml::qualified_name<char> xsd::cxx::xml::dom::name<char>(xercesc_2_8::DOMElement const&)
15071     6.5145  std::string::compare(char const*) const
14168     6.1242  strlen
11409     4.9316  _int_free
9647      4.1700  _int_malloc
9421      4.0723  DummyBindingTest::parse(xsd::cxx::xml::dom::parser<char>&, xsd::cxx::tree::flags)
9352      4.0424  std::basic_string<char, std::char_traits<char>, std::allocator<char> > xsd::cxx::tree::text_content<char>(xercesc_2_8::DOMElement const&)
6873      2.9709  free
6101      2.6372  malloc_consolidate
6069      2.6234  memset
5645      2.4401  std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > >::_M_extract_float(std::istreambuf_iterator<char, std::char_traits<char> >, std::istreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, std::_Ios_Iostate&, std::string&) const
5438      2.3506  xercesc_2_8::DOMElementImpl::getNodeType() const
4871      2.1055  malloc
4233      1.8297  std::locale::~locale()
4231      1.8289  ____strtod_l_internal
3383      1.4623  DummyEnumType::~DummyEnumType()
3351      1.4485  xsd::cxx::tree::string<char, xsd::cxx::tree::simple_type<xsd::cxx::tree::_type> >::~string()
3023      1.3067  xercesc_2_8::DOMTextImpl::getLength() const
2961      1.2799  memchr
2895      1.2514  xercesc_2_8::DOMTextImpl::getNodeType() const
2830      1.2233  std::string::append(unsigned long, char)
2587      1.1182  std::string::reserve(unsigned long)
2418      1.0452  xercesc_2_8::DOMElementImpl::getNextSibling() const
2341      1.0119  __gnu_cxx::__exchange_and_add(int volatile*, int)


The optimizations were made against XSD 3.0.0 and will appear in the next
release. I can also backport them to 2.3.1 if anybody is interested but
cannot upgrade to 3.0.0.


Boris



