I use xml.etree.ElementTree to parse a large XML file sequentially with iterparse. Over time, however, the application starts consuming a lot of memory. It seems that although iterparse processes the document sequentially, it still keeps the entire document tree in memory. How can I avoid this?
I would prefer a solution that uses the standard library rather than third-party parsers.
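For context, the memory growth described here is commonly worked around by clearing elements once they have been processed. The following is only a minimal sketch under assumptions: the helper name `iter_records` and the `record` tag are hypothetical and do not come from the original file.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, tag):
    """Yield each finished <tag> element, then release already-parsed nodes.

    `source` is a filename or a binary file object; `tag` is a hypothetical
    element name chosen for illustration.
    """
    context = ET.iterparse(source, events=("start", "end"))
    # The first event is the "start" of the root element; keep a reference
    # to it so that processed children can be detached later.
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem
            # Runs only after the consumer has asked for the next item:
            # drop everything the root still references.
            root.clear()


if __name__ == "__main__":
    data = io.BytesIO(b"<root><record id='1'/><record id='2'/></root>")
    for record in iter_records(data, "record"):
        print(record.get("id"))
```

Each yielded element has to be fully consumed before the next one is requested, because it is emptied as soon as iteration resumes.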
xml.parsers.expat, example: github.com/gil9red/SimplePyScripts/blob/… It is very low-level, but it lets you process giant XML files efficiently - gil9red
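A rough illustration of the expat approach mentioned in this comment (this is not the code from the linked repository; the function name `count_elements` and the `record` tag are hypothetical). The parser only invokes callbacks and never builds a tree, so memory use should stay roughly constant:

```python
import io
import xml.parsers.expat


def count_elements(source, tag):
    """Count occurrences of <tag> in a binary file object without building a tree."""
    count = 0

    def start_element(name, attrs):
        # Called by expat for every opening tag; attrs is a dict of attributes.
        nonlocal count
        if name == tag:
            count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    # ParseFile reads the stream in chunks via source.read(nbytes).
    parser.ParseFile(source)
    return count


if __name__ == "__main__":
    sample = io.BytesIO(b"<root><record/><record/><other/></root>")
    print(count_elements(sample, "record"))
```

The trade-off is that any state, such as the current element path or accumulated text, has to be tracked by hand in the handlers.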