[xsd-users] Poor performance of Unicode conversion
Boris Kolpackov
boris at codesynthesis.com
Tue Oct 2 06:13:43 EDT 2007
Hi Ray,
Boris Kolpackov <boris at codesynthesis.com> writes:
> Ray Lischner <rlischner at proteus-technologies.com> writes:
>
> > DOM-to-object model stage. Some of our schemas are predominantly
> > strings.
>
> Ok, good. I am going to profile this and see if anything can be
> optimized. I will get back to you with the results.
I profiled the DOM-to-object model stage in XSD 3.0.0. The test input is
an XML instance in which the following fragment is repeated until the
document is about 300KB:
<dummyBindingTest>
  <aIntElem>42</aIntElem>
  <aDoubleElem>42345.4232</aDoubleElem>
  <aNameElem>aName123_45</aNameElem>
  <aTimeElem>2005-10-28T09:00:13.323257</aTimeElem>
  <optString>bla bla bla3257</optString>
  <choice1>1st choice</choice1>
  <enumElem>ENUM1</enumElem>
</dummyBindingTest>
The test executable performs 1000 DOM-to-object parse iterations and is
statically linked, so time spent in Xerces-C++, libstdc++, etc., is
included in the results. Here is the top part of the oprofile output:
CPU: AMD64 processors, speed 1793.09 MHz (estimated)
samples % symbol name
35869 13.4132 xsd::cxx::xml::bits::char_transcoder<char>::to(unsigned short const*, unsigned long)
19841 7.4196 _int_free
17425 6.5161 xsd::cxx::xml::qualified_name<char> xsd::cxx::xml::dom::name<char>(xercesc_2_8::DOMElement const&)
16646 6.2248 std::basic_string<char, std::char_traits<char>, std::allocator<char> > xsd::cxx::tree::text_content<char>(xercesc_2_8::DOMElement const&)
13952 5.2174 _int_malloc
13515 5.0539 std::string::compare(char const*) const
12593 4.7092 strlen
9303 3.4789 free
7600 2.8420 malloc
7046 2.6349 DummyBindingTest::parse(xsd::cxx::xml::dom::parser<char>&, xsd::cxx::tree::flags)
6247 2.3361 malloc_consolidate
6146 2.2983 memset
6026 2.2534 std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > >::_M_extract_float(std::istreambuf_iterator<char, std::char_traits<char> >, std::istreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, std::_Ios_Iostate&, std::string&) const
5127 1.9172 xercesc_2_8::DOMElementImpl::getNodeType() const
5005 1.8716 __gnu_cxx::__exchange_and_add(int volatile*, int)
4752 1.7770 std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long)
4517 1.6891 ____strtod_l_internal
4466 1.6701 memcpy
4335 1.6211 std::string::reserve(unsigned long)
3325 1.2434 operator new(unsigned long)
2939 1.0990 memchr
2930 1.0957 xercesc_2_8::DOMTextImpl::getNodeType() const
2890 1.0807 xsd::cxx::tree::token<char, xsd::cxx::tree::normalized_string<char, xsd::cxx::tree::string<char, xsd::cxx::tree::simple_type<xsd::cxx::tree::_type> > > >::token(xercesc_2_8::DOMElement const&, xsd::cxx::tree::flags, xsd::cxx::tree::_type*)
2865 1.0714 std::string::append(unsigned long, char)
The test parses the above 300KB document in about 15ms. As you can see,
the UTF-16 to UTF-8 conversion accounts for only about 13% of the time.
So even if we eliminated it completely, the overall parse time could not
drop by more than about 13%.
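The bound above is an instance of Amdahl's law. As a rough sketch (the function names here are illustrative and not part of XSD), the arithmetic can be checked like this, using the 13.4132% share reported for char_transcoder<char>::to and the 15ms per-document time:

```cpp
#include <cassert>

// Overall speedup (Amdahl's law) when a fraction p of the run time
// is made s times faster and the rest is left unchanged.
double amdahl_speedup(double p, double s)
{
  assert(p >= 0.0 && p <= 1.0 && s > 0.0);
  return 1.0 / ((1.0 - p) + p / s);
}

// Limiting case: the fraction p is eliminated entirely (s -> infinity).
double max_speedup(double p)
{
  assert(p >= 0.0 && p < 1.0);
  return 1.0 / (1.0 - p);
}

// Run time left over if a fraction p of total_ms disappears.
double remaining_ms(double total_ms, double p)
{
  return total_ms * (1.0 - p);
}
```

With p = 0.134132, max_speedup gives a ratio of roughly 1.15 and remaining_ms(15.0, p) gives roughly 13ms, so removing the conversion altogether would save about 2ms per document at best.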
I optimized the top three XSD functions (char_transcoder::to,
xsd::cxx::xml::dom::name, and xsd::cxx::tree::text_content) and managed
to shave off 3ms (a 20% reduction), so the time is now about 12ms per
document. Here is the oprofile output after the optimizations:
samples % symbol name
27195 11.7552 xsd::cxx::xml::bits::char_transcoder<char>::to(unsigned short const*, unsigned long)
16805 7.2640 xsd::cxx::xml::qualified_name<char> xsd::cxx::xml::dom::name<char>(xercesc_2_8::DOMElement const&)
15071 6.5145 std::string::compare(char const*) const
14168 6.1242 strlen
11409 4.9316 _int_free
9647 4.1700 _int_malloc
9421 4.0723 DummyBindingTest::parse(xsd::cxx::xml::dom::parser<char>&, xsd::cxx::tree::flags)
9352 4.0424 std::basic_string<char, std::char_traits<char>, std::allocator<char> > xsd::cxx::tree::text_content<char>(xercesc_2_8::DOMElement const&)
6873 2.9709 free
6101 2.6372 malloc_consolidate
6069 2.6234 memset
5645 2.4401 std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > >::_M_extract_float(std::istreambuf_iterator<char, std::char_traits<char> >, std::istreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, std::_Ios_Iostate&, std::string&) const
5438 2.3506 xercesc_2_8::DOMElementImpl::getNodeType() const
4871 2.1055 malloc
4233 1.8297 std::locale::~locale()
4231 1.8289 ____strtod_l_internal
3383 1.4623 DummyEnumType::~DummyEnumType()
3351 1.4485 xsd::cxx::tree::string<char, xsd::cxx::tree::simple_type<xsd::cxx::tree::_type> >::~string()
3023 1.3067 xercesc_2_8::DOMTextImpl::getLength() const
2961 1.2799 memchr
2895 1.2514 xercesc_2_8::DOMTextImpl::getNodeType() const
2830 1.2233 std::string::append(unsigned long, char)
2587 1.1182 std::string::reserve(unsigned long)
2418 1.0452 xercesc_2_8::DOMElementImpl::getNextSibling() const
2341 1.0119 __gnu_cxx::__exchange_and_add(int volatile*, int)
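For readers who have not looked at what a function like char_transcoder<char>::to has to do, here is a minimal, self-contained sketch of a UTF-16 to UTF-8 conversion. It is not the XSD implementation, and the names to_utf8, next_code_point, and utf8_size are made up for illustration. Computing the output length in a first pass so that the string is allocated only once is an assumption about the general kind of optimization that helps with malloc/free overhead, not a description of what was actually changed in XSD.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode one code point starting at s[i] and advance i past it.
// Unpaired surrogates are replaced with U+FFFD (a choice made for
// this sketch; a real transcoder might report an error instead).
char32_t next_code_point(const char16_t* s, std::size_t n, std::size_t& i)
{
  char32_t c = s[i++];
  if (c >= 0xD800 && c <= 0xDBFF) {            // high surrogate
    if (i < n && s[i] >= 0xDC00 && s[i] <= 0xDFFF)
      c = 0x10000 + ((c - 0xD800) << 10) + (s[i++] - 0xDC00);
    else
      c = 0xFFFD;
  } else if (c >= 0xDC00 && c <= 0xDFFF) {     // stray low surrogate
    c = 0xFFFD;
  }
  return c;
}

// Number of UTF-8 bytes needed to encode code point c.
std::size_t utf8_size(char32_t c)
{
  return c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
}

std::string to_utf8(const char16_t* s, std::size_t n)
{
  // Pass 1: compute the exact output size so we allocate once.
  std::size_t bytes = 0;
  for (std::size_t i = 0; i < n;)
    bytes += utf8_size(next_code_point(s, n, i));

  std::string out;
  out.reserve(bytes);

  // Pass 2: encode each code point.
  for (std::size_t i = 0; i < n;) {
    char32_t c = next_code_point(s, n, i);
    if (c < 0x80) {
      out += static_cast<char>(c);
    } else if (c < 0x800) {
      out += static_cast<char>(0xC0 | (c >> 6));
      out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
      out += static_cast<char>(0xE0 | (c >> 12));
      out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (c >> 18));
      out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (c & 0x3F));
    }
  }
  assert(out.size() == bytes);  // the two passes must agree
  return out;
}
```

For mostly-ASCII content such as the test fragment, the first pass is cheap compared with growing the string repeatedly, but whether it pays off in practice is something only a profile like the ones above can tell.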
The optimizations are for XSD 3.0.0 and will appear in the next release.
I can also backport them to 2.3.1 if anybody is interested in them but
cannot upgrade to 3.0.0.
Boris