“Gotcha!”

As in previous chapters, we revisit some of the common pitfalls for Java developers who are new to XML. In this chapter, we have focused on the Document Object Model, and this section continues that emphasis. Although some of the points made here are more informational than directly applicable to your programming, they can help you decide when to use DOM, and they are instrumental in understanding what is going on “under the hood” of your XML applications.

Memory and Performance with DOM

We spent a lot of time earlier looking at the reasons to use DOM and the reasons to use SAX. Although we emphasized that using the DOM requires the entire XML document to be read into memory and stored in a tree structure, the point bears repeating. All too common is the scenario where a developer loads his extensive collection of complex XML documents into an XSLT processor, begins a series of offline transformations, and leaves the process running while he grabs a bite to eat. Upon returning, he finds that his Windows machine is showing the dreaded “blue screen of death” and his Linux box is screaming about memory problems. For this developer and the hundreds like him: beware the DOM for large data!

Using the DOM requires an amount of memory proportional to the size and complexity of an XML document. There is no way to avoid this relationship, and no way to lower the memory requirements. In addition, transformations themselves are often expensive operations; combined ...
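To make the memory relationship concrete, here is a minimal sketch that counts the same elements once with DOM and once with SAX. It assumes the standard JAXP factories are available. The class name `DomVersusSax`, the synthetic `catalog`/`item` document, and the helper methods are hypothetical, invented only for this illustration. The DOM version cannot return until the whole tree exists in memory, while the SAX version keeps nothing but a running count.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of any library.
public class DomVersusSax {

    // Builds a synthetic document containing the requested number of <item> elements.
    public static byte[] buildXml(int items) {
        StringBuilder sb = new StringBuilder("<catalog>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">value ")
              .append(i).append("</item>");
        }
        sb.append("</catalog>");
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // DOM: the parser builds the complete tree first, so every element,
    // attribute, and text node is held in memory while we count.
    public static int countWithDom(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getElementsByTagName("item").getLength();
    }

    // SAX: the parser reports each element as it is read; only the
    // counter survives from one callback to the next.
    public static int countWithSax(byte[] xml) throws Exception {
        final int[] count = {0};
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if ("item".equals(qName)) {
                            count[0]++;
                        }
                    }
                });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = buildXml(10000);
        System.out.println("DOM count: " + countWithDom(xml));
        System.out.println("SAX count: " + countWithSax(xml));
    }
}
```

Both methods report the same count. The difference is in what each holds in memory while working: raising the argument to `buildXml` makes the DOM tree grow along with the document, whereas the SAX handler's state stays the same size.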
