Office 2003 XML

Book Description

In Microsoft's Office 2003, users experience the merger of the power of the classic Office suite of applications with the fluidity of data exchange inherent in XML. With XML at its heart, the new version of Microsoft's desktop suite liberates the information stored in millions of documents created with Office software over the past fifteen years, making it available to a wide variety of programs.

Office 2003 XML offers an in-depth exploration of the relationship between XML and Office 2003, examining how the various products in the Office suite both produce and consume XML. Developers will learn how they can connect Microsoft Office to other systems, while power users will learn to create and analyze XML documents using familiar Office tools. The book begins with an overview of the XML features included in the various Office 2003 components, and explores in detail how Word, Excel, and Access interact with XML. This book covers both the user interface side, creating interfaces so that users can comfortably (and even unknowingly) work with XML, and the back end, exposing Office information to other processes. It also looks at Microsoft's new InfoPath application and how it fits with the rest of Office. Finally, the book's appendices introduce various XML technologies that may be useful in working with Office, including XSLT, W3C XML Schema, RELAX NG, and SOAP.

Office 2003 XML provides quick and clear guidance to anyone who needs to import or export information from Office documents into other systems. Both XML programmers and Office power users will learn how to get the most from this powerful new intersection between Office 2003 and XML.

Table of Contents

  1. Office 2003 XML
    1. Preface
      1. Who Should Read This Book
      2. Who Should Not Read This Book
      3. Organization of This Book
      4. Supporting Books
      5. Conventions Used in This Book
      6. Using Code Examples
      7. How to Contact Us
      8. Acknowledgments
        1. From Evan Lenz
        2. From Mary McRae
        3. From Simon St.Laurent
    2. 1. Microsoft Office and XML
      1. Why XML?
      2. Different Faces of XML
      3. Different XML Faces of Office
        1. Word: Editing Documents
        2. Excel: Analyzing Information
        3. Access: Sharing Data
        4. InfoPath: Editing Structured Information
        5. Other Members of the Office Family
      4. Opening Office to the World
        1. Generating Word and Excel Documents from Databases
        2. Separating Content from Presentation in Word
        3. Separating Content from Analysis in Excel
        4. Creating and Editing XML in Excel
        5. Annotating Word Documents with Additional Information
        6. Exchanging Information Between Access and the World
        7. Interacting with Web Services Using InfoPath
        8. Interacting with Web Services Using Excel, Access, or Word
    3. 2. The WordprocessingML Vocabulary
      1. Introduction to WordprocessingML
        1. A Simple Example
      2. Tips for Learning WordprocessingML
      3. WordprocessingML’s Style of Markup
        1. No Mixed Content
        2. Properties Are Set Using Empty Sub-Elements
        3. No Hierarchical Document Structures
        4. All Attributes Are Namespace-Qualified
      4. A Simple Example Revisited
        1. The w:wordDocument Element
        2. The o:DocumentProperties Element
        3. The w:fonts Element
        4. The w:styles Element
        5. The w:docPr Element
        6. The wx:sect Element
        7. The w:body Element
      5. Document Structure and Formatting
        1. Runs
          1. Text and whitespace handling
          2. Tabs and breaks
          3. Run properties
          4. Associating a run with a character style
        2. Paragraphs
          1. Paragraph properties
          2. Defining tab stops
          3. Paragraph mark properties
          4. Associating a paragraph with a paragraph style
        3. Tables
        4. Lists
          1. What makes a paragraph a list item
          2. Comparing HTML and WordprocessingML lists
          3. Finding the list definitions
          4. List Styles
        5. Sections
        6. Proofing, Protection, and Annotation Markings
          1. Document protection
          2. Proof errors
          3. Comments and other annotations
      6. Auxiliary Hints in WordprocessingML
        1. Section Containers
        2. Outline Levels and Sub-Sections
        3. List Item Formatting Hints
      7. More on Styles
        1. A Document’s Styles
        2. Default Styles
        3. Default Font Size for Paragraph Styles
        4. Derived Styles
        5. Resolving Conflicts
          1. Paragraph property conflicts
          2. Font property conflicts
        6. A Pop Quiz
        7. Dummy Styles
        8. Linked Styles
    4. 3. Using WordprocessingML
      1. Endless Possibilities
      2. Creating Word Documents
        1. Generating Data-Driven Tables
      3. Extracting Information from Word Documents
        1. Dumping a Document’s Text Content
        2. Extracting Metadata
        3. Listing Comments
      4. Modifying Word Documents
        1. Cleaning Up a Document for Publication
        2. Removing All Direct (Local) Formatting
        3. Removing Linked “Char” Styles
        4. Adjusting Font Sizes
      5. Converting Between WordprocessingML and Other Formats
        1. HTML
        2. PDF
        4. Docbook
        5. Special-Purpose Translations
    5. 4. Creating XML Templates in Word
      1. Clarifying Use Cases
      2. A Working Example
      3. Word’s Processing Model for Editing XML
      4. The Schema Library
      5. How the onload XSLT Stylesheet Is Selected
        1. Multiple Views for the Same Schema
      6. Merged XML and WordprocessingML
        1. The “Show XML Tags” Option
        2. Block-Level, Run-Level, Row-Level, and Cell-Level Tags
        3. Placeholder Text
        4. Table Rows and Repeating Elements
      7. Attaching Schemas to a Document
        1. Demystifying Schema Attachment
      8. Schema-Driven Editing
        1. The XML Structure Task Pane
        2. Editing Attributes
          1. A workaround for editing attributes
      9. Schema Validation
        1. The “Ignore Mixed Content” Document Option
      10. Document Protection
        1. Editing Restrictions
        2. Formatting Restrictions
      11. XML Save Options
        1. The “Save Data Only” Document Option
          1. Stripping mixed content
          2. Preserving processing instructions
        2. The “Apply Custom Transform” Document Option
        3. When to Use These Options
      12. Reviewing the XML-Specific Document Options
      13. Steps to Creating the onload Stylesheet
        1. Start with a Word Document
        2. Attach a Schema
        3. Apply XML Tags
        4. Convert Block-Level Leaf Tags to Run-Level Tags
        5. Assign Placeholder Text
        6. Set the XML-Related Document Options
        7. Enable Editing Restrictions
        8. Enable Formatting Restrictions
        9. Start Enforcing Protection
        10. Convert the Document to an XSLT Stylesheet
          1. A utility for generating onload stylesheets
          2. Manually customizing the onload stylesheet
      14. Deploying the Template
        1. The Initial XML Template File
        2. The Manifest File
      15. Limitations of Word 2003’s XML Support
        1. Schemas and Namespaces
        2. Document Protection Doesn’t Go Far Enough
        3. Document Protection Conflicts with Multiple Views
        4. Only One View at a Time
    6. 5. Developing Smart Document Solutions
      1. What’s a Smart Document?
        1. Smart Document Solutions
        2. Smart Document Components
      2. Creating a Smart Document Solution
        1. Schemas
          1. Existing Word environments
          2. Existing XML (or SGML) environments
          3. Starting from scratch
            1. Customer-specific DTDs or schemas
            2. DTDs or schemas developed by committee
          4. The SDArticle schema
        2. Templates
        3. Styles
        4. Shell Instance
        5. Boilerplate
        6. XSL Transformations
      3. Coding the Smart Document
        1. Required Document Actions
        2. Designing the Document Actions Task Pane
        3. The Word Object Model
          1. XML additions to the Word object model
      4. Coding in VB.NET
        1. Creating a New Project
        2. Declaring Constants
          1. The ISmartDocument interface
          2. SmartDoc Initialization and Foundations
            1. SmartDocInitialize
            2. SmartDocXMLTypeCount
            3. SmartDocXMLTypeName
            4. SmartDocXMLTypeCaption
            5. ControlCount
            6. ControlID
            7. ControlNameFromID
            8. ControlCaptionFromID
            9. ControlTypeFromID
          3. Populating controls
            1. PopulateActiveXProps
            2. PopulateCheckbox
            3. PopulateDocumentFragment
            4. PopulateHelpContent
            5. PopulateImage
            6. PopulateListOrComboContent
            7. PopulateOther
            8. PopulateRadioGroup
            9. PopulateTextboxContent
          4. Defining document actions
            1. Adding a graphic: the ImageClick method
            2. OnTextboxContentChange
            3. OnListOrComboSelectChange
            4. InvokeControl
            5. OnCheckboxChange
            6. OnRadioGroupSelectChange
            7. OnPaneUpdateComplete
          5. Associating control types and methods
      5. Manifest Files
      6. Other Files
        1. Help files
        2. Document Fragments
      7. Attaching the Smart Document Expansion Pack
      8. Deploying Your Smart Document Solution
        1. Internal Deployment
        2. External Deployment
        3. COM Versus Managed Code
        4. Template Files
      9. A Few Last Words on Smart Documents
        1. Range and Selection Objects
        2. Inserting Markup
        3. Validation
        4. Inserting Styles
        5. Stories or Streams
      10. Some Final Thoughts
    7. 6. Working with XML Data in Excel Spreadsheets
      1. Separating Data and Logic
      2. Loading XML into an Excel Spreadsheet
        1. Tables and Trees
        2. Opening XML Documents Directly
          1. Opening documents as a list
          2. Opening documents as a read-only workbook
          3. Using the XML source task pane
        3. Working with XML Maps
          1. Excel and XML Schema
          2. Creating an XML Map
      3. Editing XML Documents in Excel
      4. Loading and Saving XML Documents from VBA
    8. 7. Using SpreadsheetML
      1. Saving and Opening XML Spreadsheets
      2. Reading XML Spreadsheets
        1. Working with More Complex Spreadsheets
      3. Extracting Information from XML Spreadsheets
      4. Creating XML Spreadsheets
      5. Editing XML Maps with SpreadsheetML
    9. 8. Importing and Exporting XML with Microsoft Access
      1. Access XML Expectations
      2. Exporting XML from Access Using the GUI
        1. Exporting a Single Table
        2. Exporting Linked Tables
        3. Exporting a Query
        4. Presentation and Transformation
      3. Importing XML into Access Using the GUI
      4. Automating XML Import and Export
    10. 9. Using Web Services in Excel, Access, and Word
      1. What Are Web Services?
      2. The Microsoft Office Web Services Toolkit
      3. Accessing a Simple Web Service from Excel
      4. Accessing More Complex Web Services
      5. Accessing REST Web Services with VBA
      6. Using Web Services in Access
      7. Using Web Services in Word
    11. 10. Developing InfoPath Solutions
      1. What Is InfoPath?
      2. InfoPath in Context
        1. The Problem
        2. Alternative Approaches
          1. Building a custom application
          2. Generic server-side frameworks
        3. Rich-Client XML Editors
          1. Browser-based versus desktop deployment
          2. Document-oriented versus data-oriented
          3. Bundled versus standalone development tool
          4. Declarative versus procedural configuration
          5. “Mapping” versus “Merging”
        4. InfoPath versus XForms
      3. Components of an InfoPath Solution
        1. The InfoPath Processing Instructions
        2. A Simple Form Definition File
        3. Defining a Form Using Only an XSLT Stylesheet
          1. Conditional formatting
        4. Explicitly Binding HTML Nodes to XML Nodes
        5. Specifying an Initial XML Template
        6. Adding a Schema
      4. A More Complete Example
        1. The XSD Schema
          1. Making a concession for design mode
        2. The Initial XML Template
        3. The XSLT Stylesheet
          1. Text bindings
          2. Rich text bindings
          3. Structural bindings
          4. Date picker control
          5. Time field formatting
        4. The Form Definition File
          1. Creating toolbars, menus, and buttons
          2. Editing components
          3. The xCollection editing component
          4. The xOptional editing component
          5. The xReplace editing component
          6. The xField editing component
          7. The xTextList editing component
        5. The HTML Task Pane
        6. The Script File
        7. The Cabinet Manifest
      5. Using InfoPath Design Mode
        1. Creating a Simple Solution from an XSD Schema
        2. Creating a Form from Scratch
        3. The Layout and Views Task Panes
        4. Publishing a Form from Design Mode
        5. Developing Solutions That Play Nice with Design Mode
    12. A. The XML You Need for Office
      1. What Is XML?
      2. Anatomy of an XML Document
        1. Elements and Attributes
        2. Name Syntax
        3. XML Namespaces
        4. Well-Formedness
        5. Comments and Processing Instructions
        6. Entity References
        7. Character References
        8. Character Encodings
          1. Unicode encoding schemes
          2. Other character encodings
        9. Validity
          1. Document Type Definitions (DTDs)
          2. Connecting DTDs to documents
    13. B. The XSLT You Need for Office
      1. Sorting Out the Acronyms
        1. What Is XSL?
        2. What Is XSLT?
        3. What Is XPath?
      2. A Simple Template Approach
      3. A Rule-Based Stylesheet
      4. A More Advanced Example
      5. Conclusion
    14. C. The XSD You Need for Office
      1. What Is XSD?
      2. Creating a Simple Schema
      3. Schema Parts
        1. Namespaces
        2. Named and Anonymous Type Definitions
        3. Datatypes
        4. Varied Document Structures
        5. When Anything Is Allowed
        6. Model Groups
        7. Empty Content, Mixed Content, and Default Values
        8. Annotations
        9. Other Features
      4. Working with XML Schema
    15. D. Using DTDs and RELAX NG Schemas with Office
      1. What Are DTDs?
        1. Element Type Declarations
        2. Attribute List Declarations
        3. Putting it Together
        4. Other DTD Features
      2. What Is RELAX NG?
        1. A Basic RELAX NG Schema
        2. Advanced Features: Namespaces and Datatypes
      3. How Do I Convert DTDs and RELAX NG to XSD?
    16. Index
    17. Colophon