Effective XML: 50 Specific Ways to Improve Your XML

Book Description

Praise for Effective XML

“This is an excellent collection of XML best practices: essential reading for any developer using XML. This book will help you avoid common pitfalls and ensure your XML applications remain practical and interoperable for as long as possible.”

     —Edd Dumbill, Managing Editor, XML.com and Program Chair, XML Europe

“A collection of useful advice about XML and related technologies. Well worth reading both before, during, and after XML application development.”

     —Sean McGrath, CTO, Propylon

“A book on many best practices for XML that we have been eagerly waiting for.”

     —Akmal B. Chaudhri, Editor, IBM developerWorks

“The fifty easy-to-read items cover many aspects of XML, ranging from how to use markup effectively to what schema language is best for what task. Sometimes controversial, but always relevant, Elliotte Rusty Harold’s book provides best practices for working with XML that every user and implementer of XML should be aware of.”

     —Michael Rys, Ph.D., Program Manager, SQL Server XML Technologies, Microsoft Corporation

“Effective XML is an excellent book with perfect timing. Finally, an XML book everyone needs to read! Effective XML is a fount of XML best practices and solid advice. Whether you read Effective XML cover to cover or randomly one section at a time, its clear writing and insightful recommendations enlighten, entertain, educate, and ultimately improve the effectiveness of even the most expert XML developer. I’ll tell you what I tell all my coworkers and customers: You need this book.”

     —Michael Brundage, Technical Lead, XML Query Processing, Microsoft WebData XML Team

“This book provides great insight for all developers who write XML software, regardless of whether the software is a trivial application-specific XML processor or a full-blown W3C XML Schema Language validator. Mr. Harold covers everything from a very important high-level terminology discussion to details about parsed XML nodes. The well-researched comparisons of currently available XML-related software products, as well as the key criteria for selecting between XML technologies, exemplify the thoroughness of this book.”

     —Cliff Binstock, Author, The XML Schema Complete Reference

If you want to become a more effective XML developer, you need this book. You will learn which tools to use, and when, in order to write legible, extensible, maintainable, and robust XML code.

Page 36: How do you write DTDs that are independent of namespace prefixes? Page 82: What do parsers reliably report and what don't they? Page 130: Which schema language is the right one for your job? Page 178: Which API should you choose for maximum speed and minimum size? Page 257: What can you do to ensure fast, reliable access to DTDs and schemas without making your document less portable? Page 283: Is XML too verbose for your application?

Elliotte Rusty Harold provides you with 50 practical rules of thumb based on real-world examples and best practices. His engaging writing style is easy to understand and illustrates how you can save development time while improving your XML code. Learn to write XML that is easy to edit, simple to process, and fully interoperable with other applications and code. Understand how to design and document XML vocabularies so they are both descriptive and extensible. After reading this book, you'll be ready to choose the best tools and APIs for both large-scale and small-scale processing jobs. Elliotte provides you with essential information on building services such as verification, compression, authentication, caching, and content management.

If you want to design, deploy, or build better systems that use XML, buy this book and get going!



Table of Contents

  1. Copyright
  2. Praise for Effective XML
  3. Effective Software Development Series
  4. Preface
  5. Introduction
  6. Syntax
    1. Include an XML Declaration
      1. The version Info
      2. The encoding Declaration
      3. The standalone Declaration
    2. Mark Up with ASCII if Possible
    3. Stay with XML 1.0
      1. New Characters in XML Names
      2. C0 Control Characters
      3. C1 Control Characters
      4. NEL Used as a Line Break
      5. Unicode Normalization
      6. Undeclaring Namespace Prefixes
    4. Use Standard Entity References
    5. Comment DTDs Liberally
      1. The Header Comment
      2. Declarations
    6. Name Elements with Camel Case
    7. Parameterize DTDs
      1. Parameterizing Attributes
      2. Parameterizing Namespaces
      3. Full Parameterization
      4. Conditional Sections
    8. Modularize DTDs
    9. Distinguish Text from Markup
    10. White Space Matters
      1. The xml:space Attribute
      2. Ignorable White Space
      3. Tags and White Space
      4. White Space in Attributes
      5. Schemas
  7. Structure
    1. Make Structure Explicit through Markup
      1. Tag Each Unit of Information
      2. Avoid Implicit Structure
      3. Where to Stop?
    2. Store Metadata in Attributes
    3. Remember Mixed Content
    4. Allow All XML Syntax
    5. Build on Top of Structures, Not Syntax
      1. Empty-Element Tags
      2. CDATA Sections
      3. Character and Entity References
    6. Prefer URLs to Unparsed Entities and Notations
    7. Use Processing Instructions for Process-Specific Content
      1. Style Location
      2. Overlapping Markup
      3. Page Formatting
      4. Out-of-Line Markup
      5. Misuse of Processing Instructions
    8. Include All Information in the Instance Document
    9. Encode Binary Data Using Quoted Printable and/or Base64
      1. Quoted Printable
      2. Base64
    10. Use Namespaces for Modularity and Extensibility
      1. Choosing a Namespace URI
      2. Validation and Namespaces
    11. Rely on Namespace URIs, Not Prefixes
    12. Don't Use Namespace Prefixes in Element Content and Attribute Values
    13. Reuse XHTML for Generic Narrative Content
    14. Choose the Right Schema Language for the Job
      1. The W3C XML Schema Language
      2. Document Type Definitions
      3. RELAX NG
      4. Schematron
      5. Java, C#, Python, and Perl
      6. Layering Schemas
    15. Pretend There's No Such Thing as the PSVI
    16. Version Documents, Schemas, and Stylesheets
    17. Mark Up According to Meaning
  8. Semantics
    1. Use Only What You Need
    2. Always Use a Parser
    3. Layer Functionality
    4. Program to Standard APIs
      1. SAX
      2. DOM
      3. JDOM
    5. Choose SAX for Computer Efficiency
    6. Choose DOM for Standards Support
    7. Read the Complete DTD
    8. Navigate with XPath
    9. Serialize XML with XML
    10. Validate Inside Your Program with Schemas
      1. Xerces-J
      2. DOM Level 3 Validation
  9. Implementation
    1. Write in Unicode
      1. Choosing an Encoding
      2. A char Is Not a Character
      3. Normalization Forms
      4. Sorting
    2. Parameterize XSLT Stylesheets
    3. Avoid Vendor Lock-In
    4. Hang On to Your Relational Database
    5. Document Namespaces with RDDL
      1. Natures
      2. Purposes
    6. Preprocess XSLT on the Server Side
      1. Servlet-Based Solutions
      2. Apache
      3. IIS
    7. Serve XML+CSS to the Client
    8. Pick the Correct MIME Media Type
    9. Tidy Up Your HTML
      1. MIME Type
      2. HTML Tidy
      3. Older Browsers
    10. Catalog Common Resources
      1. Catalog Syntax
      2. Using Catalog Files
    11. Verify Documents with XML Digital Signatures
      1. Digital Signature Syntax
      2. Digital Signature Tools
    12. Hide Confidential Data with XML Encryption
      1. Encryption Syntax
      2. Encryption Tools
    13. Compress if Space Is a Problem
  10. Recommended Reading