[xsd-users] Segmentation fault in application when parsing large XML with default (8MiB) stack size
Boris Kolpackov
boris at codesynthesis.com
Fri Mar 1 08:40:03 EST 2019
Tempelaar E. (Erik) <Erik.Tempelaar at vanoord.com> writes:
> Our app crashes with a segfault when it tries to parse a rather large
> XML file (close to the stack size of 8MiB). As a workaround the stack
> for the application has been increased.
>
> #5 0x08224055 in xercesc_3_2::RegularExpression::matchUnion(xercesc_3_2::RegularExpression::Context*, xercesc_3_2::Op const*, unsigned int) const ()
> #6 0x08221097 in xercesc_3_2::RegularExpression::match(xercesc_3_2::RegularExpression::Context*, xercesc_3_2::Op const*, unsigned int) const ()
> #7 0x08224090 in xercesc_3_2::RegularExpression::matchUnion(xercesc_3_2::RegularExpression::Context*, xercesc_3_2::Op const*, unsigned int) const ()
>
> ... this continues ...

I don't think the size of the XML document itself is the problem (most
of its content should be allocated on the heap). The backtrace points
instead to the regex implementation in Xerces-C++, which is apparently
recursive and therefore stack-based.

Short of re-implementing the regex support in Xerces-C++, the only
viable workaround is to rewrite or remove the regex constraint in your
schema that causes this (or to disable XML Schema validation
altogether). I would first try to come up with a "better" regex that
doesn't trigger the deep recursion.
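
To illustrate why a recursive matcher can exhaust the stack regardless of where the document itself is stored, here is a minimal sketch. It is not Xerces-C++ code: the pattern `(a|b)*` and the names `match_recursive` and `match_iterative` are made up for the example. I am assuming the alternating matchUnion/match frames in the backtrace come from something like an alternation inside a repetition. The recursive version uses one call frame per character consumed, so its depth grows with the length of the value, while the loop version uses constant stack.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical toy matcher for the pattern (a|b)*, written recursively.
// Each consumed character adds one call level; max_depth records the
// deepest level reached so the growth can be observed.
bool match_recursive(const std::string& s,
                     std::size_t pos,
                     std::size_t& max_depth,
                     std::size_t depth = 1)
{
    assert(pos <= s.size());

    if (depth > max_depth)
        max_depth = depth;

    if (pos == s.size())
        return true; // Whole value consumed: match.

    if (s[pos] == 'a' || s[pos] == 'b')
        return match_recursive(s, pos + 1, max_depth, depth + 1);

    return false; // Character outside the alternation: no match.
}

// The same pattern matched with a loop: stack use does not depend on
// the length of the value.
bool match_iterative(const std::string& s)
{
    for (char c : s)
    {
        if (c != 'a' && c != 'b')
            return false;
    }
    return true;
}
```

For a value of n characters, `match_recursive` reaches a depth of n + 1. With real frames of a few dozen bytes or more each, a value of a few hundred thousand characters could plausibly come close to an 8MiB stack, which would fit the symptoms described above.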