XML Structure in a Nutshell

The basic structure of an XML document is simple: most documents can be reduced to a handful of components. Consider the following purchase order:

<?xml version="1.0"?>
<PurchaseOrder>
  <account refnum="2390094"/>
  <item sku="33-993933" qty="4">
    <name>Potato Smasher</name>
    <description>Smash Potatoes like never before.</description>
  </item>
</PurchaseOrder>

In this example, the first line, starting with the <? characters, is the XML declaration. It states which version of XML is being used and can also specify the character encoding of the document. The text starting with <PurchaseOrder> and ending with </PurchaseOrder> is an XML element; because it encloses everything else, it is the document's root element. An element must have both an opening and a closing tag, or, if it is empty, a single tag that ends with the characters />. The account element shown here is an example of an empty element that ends with />. The item element opens, contains two other elements, and then closes. The sku="33-993933" expression is an attribute named sku whose value, 33-993933, appears in quotes. An element can have as many attributes as needed, provided each name appears only once on that element; item carries two, sku and qty. Both the name and description elements contain character data, or text. Finally, the remaining open elements are closed and the document ends.
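To make these parts concrete, here is a minimal sketch that loads the same purchase order and inspects each component. It assumes the standard library's xml.etree.ElementTree module, which is one convenient option and not necessarily the toolkit used in the rest of this chapter; the variable names (PURCHASE_ORDER, root, account, item, well_formed) are illustrative only.

```python
import xml.etree.ElementTree as ET

# The purchase order from the listing above, held in a string.
PURCHASE_ORDER = """<?xml version="1.0"?>
<PurchaseOrder>
  <account refnum="2390094"/>
  <item sku="33-993933" qty="4">
    <name>Potato Smasher</name>
    <description>Smash Potatoes like never before.</description>
  </item>
</PurchaseOrder>"""

# Parsing returns the root element, here PurchaseOrder.
root = ET.fromstring(PURCHASE_ORDER)

# account is an empty element: no text and no children, only an attribute.
account = root.find("account")
print(account.tag, account.attrib, account.text, len(account))

# item has two attributes and two child elements.
item = root.find("item")
print(item.attrib)
print([child.tag for child in item])

# name and description contain character data, exposed as .text.
print(item.find("name").text)
print(item.find("description").text)

# An opening tag without its matching closing tag is not well-formed,
# so the parser rejects the document instead of guessing.
try:
    ET.fromstring("<item><name>Potato Smasher</item>")
    well_formed = True
except ET.ParseError:
    well_formed = False
print(well_formed)
```

Note that attribute values always arrive as strings: qty is the text "4", not the number 4, so any conversion is left to the application.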

In the remainder of this chapter, we walk through the relevant parts of the XML specification, highlighting the most important items for you to be aware of as you embark on coding with Python and XML.
