From Damon.Southworth at uk.bosch.com  Mon Apr 28 14:14:27 2025
From: Damon.Southworth at uk.bosch.com (Southworth Damon (ETAS-SID/XPC-Mnc2))
Date: Mon Apr 28 14:14:47 2025
Subject: [xsd-users] CodeSynthesis DOm and Object Model
Message-ID:

Hi,

We think that we might be hitting an issue with the auto-generated
parsing code and our object model, so I would like to run through what
we are experiencing to kick off a conversation.

We have some XML data for our schema that is larger than usual,
approximately 30Mb. We are parsing this with the usual defaults, so we
don't specify own_dom. The data is read from an istream, from which the
generated code creates an input source for a SAX parser and hands it
off to another constructor to create the DOM.

  ::xsd::cxx::xml::sax::std_input_source isrc (is, sid);
  ...
  ::xml_schema::dom::unique_ptr< ::xercesc::DOMDocument > d (
    ::xsd::cxx::xml::dom::parse< char > (
      i, h, p, f));

The 30Mb of XML is parsed into a DOM of ~340Mb. This is again handed
off to another constructor, which uses the object factory to create our
object.

  ::std::unique_ptr< ::xsd::cxx::tree::type > tmp (
    ::xsd::cxx::tree::type_factory_map_instance< 0, char > ().create (
      "comms",
      "http://www.bosch-automotive.com/gxm/comms",
      &::xsd::cxx::tree::factory_impl< ::gxm::comms::Comms >,
      true, true, e, n, f, 0));

  if (tmp.get () != 0)
  {
    ::std::unique_ptr< ::gxm::comms::Comms > r (
      dynamic_cast< ::gxm::comms::Comms* > (tmp.get ()));

This creates our Comms object (~70Mb) and returns, releasing the DOM.
All good.

However, if we go round again a second time, to create another object
from the same XML, this time after creating the second 70Mb object,
the DOM is not released and the 350Mb DOM does not appear to be
returned to the system. We are monitoring this with the kernel process
stats.

I am using the latest xsd compiler 4.2.0 and xerces-c 3.2.5 on a Linux
system with gcc-11.

We are not sure why we have started to observe this behaviour.
Maybe because it is particularly large or because we are parsing the
same document multiple times, maybe both? It is certainly not typical
to parse the same document over again.

Damon Southworth
Principal Software Engineer - Vehicle Diagnostics

From boris at codesynthesis.com  Tue Apr 29 09:48:38 2025
From: boris at codesynthesis.com (Boris Kolpackov)
Date: Tue Apr 29 09:47:05 2025
Subject: [xsd-users] CodeSynthesis DOm and Object Model
In-Reply-To:
References:
Message-ID:

Southworth Damon (ETAS-SID/XPC-Mnc2) writes:

> However, if we go round again a second time, to create another object
> from the same XML, this time after creating the second 70Mb object,
> the DOM is not released and the 350Mb DOM does not appear to be
> returned to the system.

My first thought is the Xerces-C++ memory pool. In essence, it has its
own allocator which may hold on to the released memory.

An easy way to test this theory would be to de-initialize the
Xerces-C++ runtime after constructing the object model and then
initialize it again before the subsequent parse.
I am surprised you only observe this on the second parse, though.
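
The test suggested above could be sketched roughly as follows. This is
a minimal, self-contained illustration and not code from the thread:
runtime_scope, event_log, and parse_with_fresh_runtime are hypothetical
names, and the two lambdas are stand-ins for the places where
xercesc::XMLPlatformUtils::Initialize () and
xercesc::XMLPlatformUtils::Terminate () would be called. In a real test
the generated parsing function would presumably also need the
xml_schema::flags::dont_initialize flag so that it does not initialize
the runtime on its own.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Brackets one parse with runtime initialization and termination.
// Assumption: with Xerces-C++ the two callables would wrap
// xercesc::XMLPlatformUtils::Initialize () and
// xercesc::XMLPlatformUtils::Terminate ().
class runtime_scope
{
public:
  runtime_scope (std::function<void ()> init, std::function<void ()> term)
    : term_ (std::move (term))
  {
    init ();
  }

  // Runs after the caller's return value has been constructed, that
  // is, after the object model exists.
  ~runtime_scope () { term_ (); }

  runtime_scope (const runtime_scope&) = delete;
  runtime_scope& operator= (const runtime_scope&) = delete;

private:
  std::function<void ()> term_;
};

// Hypothetical stand-in that records the order of events so the
// sketch can run without Xerces-C++ installed.
struct event_log
{
  std::vector<std::string> events;
};

// Runs one parse inside its own runtime lifetime. The parse callable
// must return an object model that no longer references the DOM
// (which is the case described in the thread, where the DOM is
// released once the object has been created).
template <typename Parse>
auto parse_with_fresh_runtime (event_log& log, Parse parse)
  -> decltype (parse ())
{
  runtime_scope rs (
    [&log] { log.events.push_back ("initialize"); },  // Initialize () here
    [&log] { log.events.push_back ("terminate"); });  // Terminate () here

  return parse ();
}
```

Calling parse_with_fresh_runtime once per pass gives each parse its own
runtime lifetime. If the resident size drops back after the second
pass, that would point towards the Xerces-C++ allocator rather than the
object model.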