[xsd-users] Segmentation fault in application when parsing large XML with default (8MiB) stack size

Boris Kolpackov boris at codesynthesis.com
Fri Mar 1 08:40:03 EST 2019


Tempelaar E. (Erik) <Erik.Tempelaar at vanoord.com> writes:

> Our app crashes with a segfault when it tries to parse a rather large
> XML file (close to the stack size of 8MiB). As a workaround the stack
> size for the application has been increased.
>
> #5  0x08224055 in xercesc_3_2::RegularExpression::matchUnion(xercesc_3_2::RegularExpression::Context*, xercesc_3_2::Op const*, unsigned int) const ()
> #6  0x08221097 in xercesc_3_2::RegularExpression::match(xercesc_3_2::RegularExpression::Context*, xercesc_3_2::Op const*, unsigned int) const ()
> #7  0x08224090 in xercesc_3_2::RegularExpression::matchUnion(xercesc_3_2::RegularExpression::Context*, xercesc_3_2::Op const*, unsigned int) const ()
>
> ... this continues ...

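I don't think the size of the XML document itself is the problem (most
of its content should be allocated on the heap). More likely it is the
regex implementation in Xerces-C++ which, judging by the alternating
match()/matchUnion() frames in your backtrace, is recursive and so can
use a lot of stack on a long value.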
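
To illustrate the effect, here is a minimal sketch of a recursive matcher.
It is a toy written for this message, not the Xerces-C++ code: the pattern
(a|b)* and the names match_rec and recursion_depth are made up for the
example. It only shows how a matcher that recurses once per consumed
character reaches a call depth equal to the length of the input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy recursive matcher for the pattern (a|b)*. NOT the Xerces-C++
// implementation; it only mimics the "one call per character" shape.
// depth counts the logical recursion level, max_depth records the peak.
static bool match_rec(const std::string& s, std::size_t pos,
                      std::size_t depth, std::size_t& max_depth)
{
    assert(pos <= s.size());
    if (depth > max_depth)
        max_depth = depth;

    if (pos == s.size())
        return true;                       // whole input consumed

    if (s[pos] == 'a' || s[pos] == 'b')    // the union (a|b)
        return match_rec(s, pos + 1, depth + 1, max_depth);

    return false;                          // character not allowed
}

// Returns the deepest recursion level reached while matching s and
// stores the match result in matched.
std::size_t recursion_depth(const std::string& s, bool& matched)
{
    std::size_t max_depth = 0;
    matched = match_rec(s, 0, 1, max_depth);
    return max_depth;
}
```

For a 1000-character input this reports a depth of 1001, so the depth grows
with the value. An optimizing compiler may turn this particular toy into a
loop; the counter records the logical depth, which is what a real matcher
with non-trivial frames would pay for in stack space.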

Short of re-implementing the regex support in Xerces-C++, the only
viable workaround is to rewrite or remove the regex (pattern) constraint
in your schema that triggers this, or to disable XML Schema validation
altogether. I would first try to come up with a "better" regex that
doesn't trigger the deep recursion.
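
If you end up removing the pattern from the schema, one option is to check
the value in application code after parsing. The sketch below assumes, purely
for illustration, that the constraint was something like ([0-9A-F][0-9A-F])*
(pairs of upper-case hex digits); your actual pattern is not shown in this
thread, and the function name is_hex_pairs is hypothetical. The point is
that a simple loop does the same check with constant stack usage.

```cpp
#include <cstddef>
#include <string>

// True if c is one of 0-9 or A-F.
static bool is_hex_digit(char c)
{
    return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F');
}

// Iterative stand-in for the hypothetical pattern ([0-9A-F][0-9A-F])*:
// the value must have even length and contain only hex digits.
// Uses a plain loop, so stack usage does not depend on the length of s.
bool is_hex_pairs(const std::string& s)
{
    if (s.size() % 2 != 0)
        return false;

    for (std::size_t i = 0; i < s.size(); ++i)
        if (!is_hex_digit(s[i]))
            return false;

    return true;
}
```

You would call something like this on the relevant element or attribute
value once the document has been parsed, and report an error yourself if
it returns false.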


