Input

By far the hardest part of this or any similar problem is parsing the non-XML input data. Everything else pales in comparison. Unlike parsing XML, you generally cannot rely on a library to do the hard work for you; you have to do it yourself. Also unlike XML, there's little guarantee that the data is well-formed. More likely than not, you will encounter incorrectly formatted data.

In this case, because the records are separated into lines, I'll read each line, one at a time, using the readLine() method of java.io.BufferedReader. This method works well enough as long as the data is in a file, although it's potentially buggy when the data is served over a network socket.
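As a rough illustration of the line-at-a-time approach, here is a minimal sketch. The class name LineReaderSketch, the method readRecords(), and the sample data are hypothetical names chosen for this example, not the book's actual code. The sketch reads from a StringReader so that it is self-contained; a FileReader could be substituted to read from a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not taken from the original text.
public class LineReaderSketch {

    // Collects every line from the source. readLine() strips the line
    // terminator (\n, \r, or \r\n) and returns null at end of stream.
    public static List<String> readRecords(Reader source) throws IOException {
        List<String> records = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                records.add(line);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data with mixed line terminators.
        String sample = "first record\nsecond record\r\nthird record";
        for (String record : readRecords(new StringReader(sample))) {
            System.out.println(record);
        }
    }
}
```

Each string returned this way is one raw record, still unparsed. Because malformed input is likely, a real program would probably validate or skip bad lines rather than assume every record is usable.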

Each line is dissected into its component fields inside the splitLine() ...
