Advanced Applications and Structures in XML Processing: Label Streams, Semantics Utilization and Data Query Technologies

Book Description

Advanced Applications and Structures in XML Processing: Label Streams, Semantics Utilization and Data Query Technologies presents significant research results and recent findings from scholars worldwide who are working to explore and expand the role of XML. The collection covers XML processing technologies in connection with advanced applications. It gives readers the opportunity to study individual topics in detail and to survey XML research at a comprehensive level.

Table of Contents

  1. Copyright
  2. Editorial Advisory Board
  3. List of Reviewers
  4. Foreword
  5. Preface
    1. INTRODUCTION
    2. CHAPTER OVERVIEW
  6. Acknowledgment
  7. 1. XML Data Management
    1. 1. XML Native Storage and Query Processing
      1. ABSTRACT
      2. INTRODUCTION
      3. NATIVE XML STORAGE METHODS
        1. Breadth-First-Based Tree Partitioning
        2. Depth-First-Based Tree Partitioning
      4. QUERY PROCESSING TECHNIQUES
      5. STRUCTURAL XML INDEXES
        1. Tag Name Index
        2. Path Index
        3. Bisimulation Graph
        4. F&B Bisimulation Graph
        5. Feature-Based Index
      6. CONCLUSION
      7. REFERENCES
    2. 2. XML Data Management in Object Relational Database Systems
      1. ABSTRACT
      2. INTRODUCTION
      3. 1 RATIONALE OF XML DATA MANAGEMENT IN ORDBMS
        1. 1.1 User Perspective
        2. 1.2 Data Model Analysis Perspective
        3. 1.3 ORDBMS Engineering Perspective
      4. 2 XML FUNCTIONALITY AND USECASE IN ORDBMS
        1. 2.1 XML Type as a Native Datatype in ORDBMS
        2. 2.2 Generating XML Data from Relational Data using SQL/XML generation functions
          1. SQL/XML GenXML Query:Q1
          2. R1: Result of Q1 - Two XML Documents (department->employee->project hierarchy)
        3. 2.3 Querying XML Data using XMLQuery(), XMLExists()
          1. SQL/XML Query XML Query: Q2
          2. R2: Result of Q2
          3. SQL/XML Query XML Query: Q3
        4. 2.4 Relational Access on XML using XMLTable construct
          1. SQL/XML XMLTable Query: Q4
        5. 2.5 Updating XML Using XQuery Update Facility
          1. SQL/XML Update XML Statement-Q5
        6. 2.6 Standalone XQuery
          1. Q6: Top XQuery over relational data
      5. 3 XML STORAGE, INDEX TECHNIQUES IN ORDBMS
        1. 3.1 Decomposed Storage Using RDBMS
          1. 3.1.1 XML Schema Dependent Storage
          2. 3.1.2 XML Schema Independent Storage
            1. 3.1.2.1 Edge Table Approach
            2. 3.1.2.2 Node Table Approach
          3. OR_SQL for XPath Predicate Branch
          4. Node-Path SQL for XPath Predicate Branch
          5. 3.1.3 Comparison Between Schema Dependent and Independent Approach
        2. 3.2 Aggregated storage
          1. 3.2.1 CLOB, BLOB Approach
          2. 3.2.2 Tree Based Storage
          3. 3.2.3 Indexing Aggregated Storage
            1. 3.2.3.1 Path and Value Index
            2. 3.2.3.2 XMLTable Index
            3. 3.2.3.3 Integrated Path-Value Index
      6. SECTION 4: XQUERY, SQL/XML PROCESSING TECHNIQUES IN ORDBMS
        1. 4.1 XML Generation from Relational Data
        2. 4.2 Querying XML view over Relational Data
          1. O2: Optimization Result of Q6
        3. 4.3 XQuery processing
          1. 4.3.1 Integrated XQuery/SQL/XML Processors
          2. O1: Optimization Result of Q3 Using OR XML Storage
          3. O2: Optimization Result of Q3 Using Node-Path Index Over Aggregated XML Storage
          4. 4.3.2 Standalone Iterator-Based XQuery Processors
          5. 4.3.3 Hybrid XQuery Processors
          6. 4.3.4 XML Query Algebra
      7. SECTION 5: SUMMARY AND FUTURE WORK
      8. REFERENCES
    3. 3. XML Compression
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
        1. Dictionary Encoding
        2. Huffman Encoding
        3. Arithmetic Encoding
      4. CLASSIFICATION OF XML COMPRESSION TECHNIQUES
        1. Schema-Dependent Compression vs. Schema-Independent Compression
          1. Non-Queriable Compression vs. Queriable Compression
            1. Lazy Decompression (Compression Granule)
            2. Query Operations
            3. Selective Access
          2. Homomorphic Compression vs. Non-Homomorphic Compression
      5. REPRESENTATIVE XML SPECIFIC COMPRESSION TECHNIQUES
        1. XMill
          1. XMill Compresses XML Data Based on the Following Three Principles
          2. XMLPPM
          3. XAUST
          4. XGRIND
          5. XQzip
          6. XPRESS
          7. XBzip/XBzipIndex
          8. XCQ
          9. XQueC
          10. ISX
        2. Applications of Compression Techniques
          1. XML Data Management System
          2. Internet Applications Exchanging XML Data
      6. EVALUATION OF COMPRESSION TECHNIQUES
        1. Compression Ratio
        2. Compression Time
        3. Decompression Time
      7. DISCUSSION
      8. FUTURE RESEARCH DIRECTIONS
      9. CONCLUSION
      10. ACKNOWLEDGMENT
      11. REFERENCES
    4. 4. XML Benchmark
      1. ABSTRACT
      2. INTRODUCTION
      3. APPLICATION BENCHMARK
        1. XBench
          1. Dataset
            1. Text-Centric Documents
            2. Data-Centric Documents
          2. Data Generation
          3. Queries
        2. XMach-1
          1. System Structure
          2. Dataset
          3. Queries
        3. XMark
          1. Dataset
          2. Queries
        4. XOO7
          1. Dataset
          2. Queries
        5. TPoX
          1. Dataset
          2. Queries
        6. Discussion
      4. MICRO BENCHMARK
        1. Introduction
        2. Michigan Benchmark
          1. Data Set
          2. Queries
        3. MemBeR
          1. MemBeR Generator
            1. Depth Based Mode
            2. Fanout Based Mode
            3. Advanced Mode
          2. Data Sets and Queries
        4. Features List
        5. XSelMark
          1. Data Set and Queries
        6. Duplication Detection Benchmark
          1. Dataset
          2. Website
        7. Discussion
      5. XML DOCUMENT GENERATOR
        1. Synthetic XML Generator
          1. Path Tree
          2. Tag Names
          3. Frequency Distribution
          4. Element Values
        2. ToXgene
          1. TSL
          2. Features
            1. Element Specification
            2. Distribution Control
            3. Content Sharing
            4. Integrity Constraints
            5. Existing Data Reuse
            6. Extensibility
          3. Structure
        3. IBM XML Generator
        4. XML Generator for XSQO Research
          1. Features
          2. Functionality
            1. Insert Single Elements
            2. Set Element Values
            3. Remove Element
            4. Insert Multiple Elements
        5. Discussion
      6. REAL DATA SET
      7. FUTURE WORK
      8. CONCLUSION
      9. REFERENCES
  8. 2. XML Index and Query
    1. 5. Index Structures for XML Databases
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
        1. Data Models
          1. Edge-Labeled Tree Data Model
          2. Node-Labeled Tree Data Model
          3. XPath
      4. STRUCTURAL INDEXING SCHEMES FOR XML DATA
        1. Criteria for Evaluation of Structural Indexing Schemes
        2. Node Indexing Schemes
          1. Criteria for Evaluation of Node Indexes
          2. Interval Labeling Scheme
            1. Prefix Labeling Scheme
            2. Summary of Node Indexes
        3. Graph Indexing Schemes
          1. Deterministic Graph Indexes
            1. Strong DataGuide
            2. Index Fabric
          2. Non-Deterministic Graph Indexes with Backward Bisimilarity
            1. (1-index)
            2. A(k)-Index
          3. Non-Deterministic Graph Indexes with (Forward & Backward) Bisimilarity
            1. F&B-index
          4. Summary of Graph Indexes
        4. Sequence Indexing Schemes
          1. Specific Comparison Criteria of Sequence Indexes
          2. ViST (Top-Down Sequence Indexes)
          3. PRIX (Bottom-Up Sequence Indexes)
          4. Summary of Sequence Indexes
        5. Structural Indexes Critique
          1. Criteria for Comparison among Structural Indexing Schemes
          2. Comparison Among Structural Indexes
          3. Future Research Directions
      5. CONCLUSION
      6. ACKNOWLEDGMENT
      7. REFERENCES
    2. 6. Labeling XML Documents
      1. ABSTRACT
      2. INTRODUCTION
      3. STATIC XML LABELING SCHEMES
        1. Containment Labeling Schemes
          1. Dewey ID Labeling Schemes
        2. Extended Dewey Labeling Scheme
          1. Extended Dewey
          2. Finite State Transducer (FST)
          3. Properties of Extended Dewey
      4. DYNAMIC XML LABELING SCHEMES
        1. Region-Based Dynamic Labeling Schemes
        2. Prefix-Based Dynamic Labeling Schemes
          1. ORDPATH Labeling Scheme
          2. DDE Labeling Scheme
        3. Prime Labeling Scheme
        4. The Encoding Schemes
          1. Dynamic Formats
          2. Encoding Algorithm
          3. Application of Encoding Schemes
      5. CONCLUSION
      6. REFERENCES
      7. ADDITIONAL READING
    3. 7. Keyword Search on XML Data
      1. INTRODUCTION
      2. 1.1 QUERY MODEL AND QUERY RESULT
        1. 1.1.1 Keyword Query
          1. 1.1.2 Query Result
      3. 1.2 WHAT ARE THE PROBLEMS INVOLVED?
      4. 1.3 IDENTIFYING RELEVANT KEYWORD MATCHES
        1. 1.3.1 XML Trees
          1. 1.3.1.1 Identification Strategies
          2. 1.3.1.2 Desirable Properties for Evaluation
        2. 1.3.2 XML Graphs
      5. 1.4 IDENTIFYING OTHER RELEVANT DATA NODES
      6. 1.5 GENERATING QUERY RESULTS EFFICIENTLY
        1. 1.5.1 Indexes
        2. 1.5.2 Materialized Views
      7. 1.6 RANKING
      8. 1.7 RESULT SNIPPETS
      9. 1.8 CONCLUSION AND FUTURE RESEARCH DIRECTIONS
      10. REFERENCES
    4. 8. A Framework for Cost-Based Query Optimization in Native XML Database Management Systems
      1. ABSTRACT
      2. INTRODUCTION
        1. Motivation
        2. Contribution
      3. BACKGROUND
        1. Challenges of Native XML Query Optimization
        2. Related Work
          1. Important Concepts of Relational Cost-based Query Optimization
          2. Software Engineering Approaches to Query Optimization
          3. First Steps Towards Cost-Based XML Query Optimization
      4. THE QUERY EVALUATION PROCESS
      5. MOTIVATING EXAMPLE
      6. A FRAMEWORK FOR COST-BASED NATIVE XML QUERY OPTIMIZATION
        1. Design Goals
        2. Preliminaries
        3. XQuery Fragment Considered
        4. The Architecture of the Query Optimization Framework
          1. The Plan Generator
          2. The State Space
          3. The Search Strategy
          4. The Transformer Component
            1. The Implementation Manager
          5. The Cost Model and the Cost Estimator
          6. The Translator Component
          7. The Physical Algebra
      7. AN INSTANCE OF THE QUERY OPTIMIZATION FRAMEWORK
        1. Access Paths
        2. Join Operators
      8. EMPIRICAL EVALUATION
        1. Experimental Setup
        2. XPathMark Queries
        3. XMark Benchmark Queries
      9. FUTURE RESEARCH DIRECTIONS
      10. CONCLUSION
      11. ACKNOWLEDGMENT
      12. REFERENCES
      13. ADDITIONAL READING
      14. ENDNOTES
  9. 3. XML Stream Processing, Publish/Subscribe, and P2P
    1. 9. XML Stream Processing: Stack-Based Algorithms
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
        1. Queries
          1. XPath and XQuery
          2. Twig Query
          3. Generalized Tree Pattern
        2. Data
        3. Technical Challenges
          1. Exponential Enumeration
          2. Buffering Data
          3. Sharing
        4. Preliminaries: Technology Background
          1. Related Approaches
            1. Automaton-Based Filtering Algorithms
            2. Automaton-Based Query Algorithms
            3. Non-Automaton-Based Filtering Algorithms
            4. Single-Stack Architecture
          2. Holistic Twig Join
            1. Structural Join
            2. PathStack/TwigStack
          3. Benefits of Stack-Based Architecture
      4. STACK-BASED SINGLE-QUERY PROCESSING
        1. Baseline Framework
        2. TwigM
        3. LQ/EQ
          1. LQ
          2. EQ
        4. Twig2Stack
          1. Hierarchical Stack Encoding
          2. Twig2Stack
          3. Optimization for Memory Cost
        5. StreamTX
        6. Discussion: A Unified Approach
          1. Buffer Optimality for a Stream Twig Query Processing
          2. Optimization for a Twig with Predicate Nodes
      5. STACK-BASED MULTI-QUERY PROCESSING
        1. IndexFilter
        2. AFilter
        3. GFilter
        4. Discussion
      6. FUTURE RESEARCH DIRECTIONS
      7. CONCLUSION
      8. REFERENCES
    2. 10. Content-Based Publish/Subscribe for XML Data
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. CONTENT-BASED PUBLISH/SUBSCRIBE SYSTEM FOR XML DATA
        1. Matching Efficiency for XPath Queries in the System
        2. Functionality of the System
      5. IMPROVING THE MATCHING EFFICIENCY OF THE SYSTEM
        1. Local Optimization
        2. Approaches to Share Processing
        3. Approaches to Reduce the Number of Queries
        4. Approaches to Reduce the Matching Complexity
        5. Global Optimization
        6. Evaluation of the Approaches to Improve the Matching Efficiency
      6. EXTENDING THE FUNCTIONALITIES OF THE SYSTEM
        1. The Approach to Handle the Fragmented XML Data
        2. The Approach to Handle the Heterogeneous XML Data
        3. The Approach to Handle the Stateful Subscriptions
      7. FUTURE RESEARCH DIRECTIONS
      8. CONCLUSION
      9. REFERENCES
    3. 11. Content-Based XML Data Dissemination
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
        1. Content-Based Routing
        2. Covering and Merging
      4. RELATED WORK
      5. ADVERTISEMENT-BASED ROUTING
        1. XML-Based Advertisements
        2. Non-Recursive Advertisement
        3. Recursive Advertisement
      6. COVERING AND MERGING
        1. Subscription Tree
        2. Covering Algorithm
        3. Merging
      7. EVALUATION
      8. CONCLUSION
      9. ACKNOWLEDGMENT
      10. REFERENCES
    4. 12. XP2P: A Framework for Fragmenting and Managing XML Data over Structured Peer-to-Peer Networks
      1. ABSTRACT
      2. INTRODUCTION
      3. XP2P OVERVIEW
      4. BACKGROUND
      5. XP2P FRAGMENTATION AND REPLICATION MODEL
        1. Fragmenting XML Data
        2. Replicating and Updating XML Data
      6. PATH-BASED LIGHTWEIGHT DISTRIBUTED INDEXES
        1. Fingerprinting Path Expressions
        2. Computing Path Fingerprints
      7. XPATH QUERIES IN XP2P
        1. Handling XML Content
        2. Exact-Match XPath Queries
        3. Partial-Match XPath Queries
        4. Descendant XPath Queries
        5. The Composite Query Evaluation Algorithm
      8. EXPERIMENTAL ASSESSMENT
        1. The Experimental Framework
        2. Experimental Setup
        3. Experimental Results
      9. FUTURE RESEARCH DIRECTIONS
      10. CONCLUSION
      11. REFERENCES
      12. KEY TERMS AND DEFINITIONS
      13. ENDNOTES
  10. 4. XML Query Translation and Data Integration
    1. 13. Normalization and Translation of XQuery
      1. ABSTRACT
      2. INTRODUCTION
      3. APPROACHES TO XQUERY NORMALIZATION AND TRANSLATION
        1. Query Representations for XQuery Optimization
        2. Normalization and Translation of XQuery
      4. NAL: THE NATIX ALGEBRA FOR XQUERY OPTIMIZATION
      5. NORMALIZATION OF XQUERY
      6. SUPPORTED XQUERY FRAGMENT
      7. A NOTATION FOR NORMALIZATION RULES
        1. FLWR Expressions
        2. XPath Expressions
        3. Example Query
        4. Restrictions
      8. TRANSLATION OF XQUERY INTO THE NATIX ALGEBRA
        1. Translation Function
        2. Example Query
        3. Mapping to Calculus Representation
      9. CONCLUSION AND FUTURE RESEARCH DIRECTIONS
      10. REFERENCES
      11. ADDITIONAL READING
      12. BOOKS
      13. JOURNAL SPECIAL ISSUES
      14. DOCTORAL DISSERTATION
    2. 14. XML Data Integration: Schema Extraction and Mapping
      1. ABSTRACT
      2. INTRODUCTION
      3. APPLICATIONS
        1. Why XML?
      4. OUTLINE OF THE CHAPTER
        1. Running Example
      5. SCHEMA EXTRACTION
        1. Extraction of Tree and Graph Structures
        2. DTD and XML Schema Extraction and Inference
          1. DTD Extraction
          2. XML Schema Extraction
      6. MATCHING AND MAPPING
        1. Terminology
        2. Matching Operation: Identifying Mappings
          1. Structure-Based Techniques
          2. Hybrid and Composite Methods
          3. Measuring the Matching Quality
        3. Mapping Rule Generation
        4. Other Considerations
      7. FUTURE RESEARCH DIRECTIONS
      8. CONCLUSION
      9. REFERENCES
    3. 15. XML Data Integration: Merging, Query Processing and Conflict Resolution
      1. ABSTRACT
      2. INTRODUCTION
      3. MERGING
        1. Handling Conflicts During the Merge Process
          1. Generalized Mappings
          2. Conflict-Elimination Strategies
          3. Conflict-Preserving Strategies
      4. QUERY PROCESSING
        1. Querying XML Data
        2. XML Query Processing with Local Sources
          1. Alternative Architectures for Query Processing with Local Sources
          2. XML Query Reformulation
          3. Source Selection and XML Query Routing
        3. Query Processing over Uncertain XML Data
          1. Data Pre-cleaning vs. Pay-as-you-Go
          2. Data and Result Compatibility
            1. Data Compatibility Analysis
            2. Result Compatibility Analysis
            3. Data and Result Compatibility Analysis
          3. Twig Query Processing on Graphs
      5. FUTURE RESEARCH DIRECTIONS
      6. CONCLUSION
      7. REFERENCES
  11. 5. XML Semantics Utilization and Advanced Application
    1. 16. Document and Schema XML Updates
      1. ABSTRACT
      2. INTRODUCTION
      3. LANGUAGES FOR DOCUMENT UPDATES
        1. Document Update Operations
        2. Distinguished Features of Document Update Languages
        3. XQuery-UF
        4. XQuery!
        5. FLux
        6. XUpdate
      4. LANGUAGES FOR SCHEMA UPDATES
        1. Primitives for Schema Evolution
        2. A Graphical Language for Schema Evolution
        3. XSchemaUpdate
      5. EVOLUTION AND VERSIONING: IMPLICATIONS ON DOCUMENT VALIDATION, ORGANIZATION AND RETRIEVAL
        1. Impact of Document and Schema Evolution
        2. Impact of Document and Schema Versioning
      6. DBMS SUPPORT FOR UPDATES AT DOCUMENT AND SCHEMA LEVEL
        1. SQL Server 2008
        2. DB2 Version 9.5
        3. Oracle 11g
        4. Tamino
        5. Comparison
      7. CONCLUSION AND FUTURE TRENDS
      8. REFERENCES
    2. 17. Integration of Relational and Native Approaches to XML Query Processing
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
        1. Twig Pattern Query and Document Labeling
        2. Semantic Information on Object, Property and Value
      4. EXISTING WORK
        1. Relational Approach
        2. Native Approach
        3. Hybrid Management of Relational Data and XML
        4. A Summary on Relational Approach and Native Approach
      5. HYBRID APPROACH IN XML QUERY PROCESSING
        1. An Overview
        2. Data Structure Construction
        3. Query Processing
          1. Content Search and Query Rewriting
          2. Structural Search and Value Extraction
        4. A Summary
      6. SEMANTIC OPTIMIZATIONS
        1. Optimization 1: Object/Property Table
        2. Optimization 2: Object Table
          1. Rare Property
          2. Vertical Partitioning
        3. Optimization 3: Relationship Table
        4. Summary on Semantic Optimizations
      7. FUTURE RESEARCH DIRECTIONS
      8. CONCLUSION
      9. REFERENCES
    3. 18. XML Query Evaluation in Validation and Monitoring of Web Service Interface Contracts
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
        1. An Example: The Online Trading Company
        2. Interface Contracts
        3. Enforcing Interface Contracts
      4. A CASE FOR XML QUERY PROCESSING
      5. XML QUERY PROCESSING FOR TRACE VALIDATION
        1. Translation to XQuery
        2. Extensions to LTL
      6. XML QUERY PROCESSING FOR RUNTIME MONITORING
        1. Available Streaming Capabilities
        2. The Forward-only Fragment of LTL
        3. Mapping the "Until" Operator
        4. Experimental Results
      7. CONCLUSION
      8. REFERENCES
      9. ENDNOTES
  12. Compilation of References
  13. About the Contributors