1.2. Origins of XML

The twentieth century has been an information age unparalleled in human history. Universities churn out books and articles, the media is richer with content than ever before, and even space probes return more data about the universe than we know what to do with. Organizing all this knowledge is not a trivial concern.

Early electronic formats were more concerned with describing how things looked (presentation) than with document structure and meaning. troff and TeX, two early formatting languages, did a fantastic job of formatting printed documents, but lacked any sense of structure. Consequently, documents were limited to being viewed on screen or printed as hard copies. You couldn't easily write programs to search for and siphon out information, cross-reference it electronically, or repurpose documents for different applications.

Generic coding, which uses descriptive tags rather than formatting codes, eventually solved this problem. The first organization to seriously explore this idea was the Graphic Communications Association (GCA). In the late 1960s, the "GenCode" project developed ways to encode different document types with generic tags and to assemble documents from multiple pieces.

The next major advance was Generalized Markup Language (GML), a project by IBM. GML's designers, Charles Goldfarb, Edward Mosher, and Raymond Lorie,[1] intended it as a solution to the problem of encoding documents for use with multiple information subsystems. Documents ...
