XML Data Mining

XML Data Mining, by Andrea Tagarelli. Published by IGI Global.
  1. Cover
  2. Title Page
  3. Copyright Page
  4. Editorial Advisory Board and List of Reviewers
    1. Editorial Advisory Board
    2. List of Reviewers
  5. Foreword
  6. Preface
    1. Objectives and Mission
    2. Prospective Audience and Potential Uses
    3. Organization of the Book
  7. Acknowledgment
  8. Section 1: Models and Measures
    1. Chapter 1: A Study of XML Models for Data Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. DATA MODELS FOR XML DOCUMENT MINING
      4. DATA MINING TASKS USING THE VARIOUS MODELS
      5. CURRENT ISSUES AND CHALLENGES IN MODELLING XML DOCUMENTS FOR MINING
      6. FUTURE MODELS OF XML DOCUMENTS AND THEIR OPPORTUNITIES
      7. CONCLUSION
    2. Chapter 2: Modeling, Querying, and Mining Uncertain XML Data
      1. Abstract
      2. Introduction
      3. Models of Uncertainty
      4. Probabilistic XML
      5. Mining Information from Probabilistic XML
      6. Conclusion and Future Research
    3. Chapter 3: XML Similarity Detection and Measures
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. SIMILARITY MEASURES
      5. XML SIMILARITY
    4. Chapter 4: Efficient Identification of Similar XML Fragments Based on Tree Edit Distance
      1. Abstract
      2. INTRODUCTION
      3. RELATED WORK
      4. Background
      5. A Randomized Structure for the Estimation of Tree Edit Distance
      6. Similarity Detection Algorithms for XML Fragments
      7. Similarity Detection Algorithms for XML Fragment Sets
      8. COMPUTATIONAL ANALYSIS
      9. Experimental Results
      10. Future Research Directions
      11. Conclusion
  9. Section 2: Clustering and Classification
    1. Chapter 5: Approximate Matching Between XML Documents and Schemas with Applications in XML Classification and Clustering
      1. Abstract
      2. INTRODUCTION
      3. Background
      4. Tree Matching Algorithms
      5. Applications in XML Document Classification/Clustering
      6. Experimental Evaluation
      7. FUTURE RESEARCH DIRECTIONS
      8. CONCLUSION
    2. Chapter 6: The Role of Schema and Document Matchings in XML Source Clustering
      1. Abstract
      2. INTRODUCTION
      3. Computing Similarities between XML Schemas/DTDs
      4. Computing Similarities between XML Documents
      5. CLUSTERING DTDs and XML schemas
      6. CLUSTERING XML DOCUMENTS
      7. FURTHER RESEARCH DIRECTIONS
      8. CONCLUSION
    3. Chapter 7: XML Document Clustering
      1. Abstract
      2. INTRODUCTION
      3. STRUCTURE-BASED XML CLUSTERING
      4. CONTENT-BASED AND HYBRID XML CLUSTERING
      5. SEMANTIC XML CLUSTERING
      6. APPLICATION DOMAINS
      7. CONCLUSION
    4. Chapter 8: Fuzzy Approaches to Clustering XML Structures
      1. Abstract
      2. INTRODUCTION
      3. BACKGROUND
      4. STRUCTURE ENCODING and similarity
      5. Fuzzy clustering algorithms
      6. Fuzzy clustering of XML structures
      7. Future Research Directions
      8. Conclusion
    5. Chapter 9: XML Tree Classification on Evolving Data Streams
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. XML Tree Classification Framework on Data Streams
      5. Experimental Evaluation
      6. CONCLUSION AND FUTURE WORK
    6. Chapter 10: Data Driven Encoding of Structures and Link Predictions in Large XML Document Collections
      1. ABSTRACT
      2. INTRODUCTION
      3. SELF-ORGANIZING MAP FOR STRUCTURES
      4. INCOMING AND OUTGOING LINK PREDICTION IN XML DOCUMENTS
      5. EXPERIMENTAL RESULTS
      6. FUTURE RESEARCH DIRECTIONS
      7. CONCLUSION
  10. Section 3: Association Mining
    1. Chapter 11: Frequent Pattern Discovery and Association Rule Mining of XML Data
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MINING FREQUENT PATTERNS AND ASSOCIATION RULES ON XML DATA
      5. FUTURE RESEARCH DIRECTIONS
      6. CONCLUSION
    2. Chapter 12: A Framework for Mining and Querying Summarized XML Data through Tree-Based Association Rules
      1. Abstract
      2. INTRODUCTION
      3. Background
      4. managing tree-based association rules
      5. The TreeRuler prototype applied to the Odyssey scenario
      6. Conclusion
    3. Chapter 13: Discovering Higher Level Correlations from XML Data
      1. Abstract
      2. INTRODUCTION
      3. RELATED WORK
      4. DEFINITIONS AND NOTATION
      5. The xml-GERMI framework
      6. EXPERIMENTAL EVALUATION
      7. FUTURE RESEARCH DIRECTIONS
      8. Conclusion
  11. Section 4: Semantics-Aware Mining
    1. Chapter 14: XML Mining for Semantic Web
      1. Abstract
      2. INTRODUCTION
      3. INCORPORATING BACKGROUND KNOWLEDGE TO DATA MINING PROCESSES
      4. THE SEMANTIC WEB SCENARIO
      5. TOWARDS A SEMANTICS-AWARE KNOWLEDGE DISCOVERY FRAMEWORK
      6. SEMANTICS-AWARE MINING OF COMPLEX DATA SOURCES
      7. Future Research Directions
      8. Conclusion
    2. Chapter 15: A Component-Based Framework for the Integration and Exploration of XML Sources
      1. Abstract
      2. INTRODUCTION
      3. architecture of the proposed framework
      4. A POSSIBLE IMPLEMENTATION OF THE IIPE COMPONENT
      5. A POSSIBLE IMPLEMENTATION OF THE ISIM COMPONENT
      6. A POSSIBLE IMPLEMENTATION OF THE SCM COMPONENT
      7. RELATED WORK
      8. CONCLUSION and FUTURE WORKS
    3. Chapter 16: Matching XML Documents at Structural and Conceptual Level using Subtree Patterns
      1. Abstract
      2. Introduction
      3. Background
      4. Problem Definition
      5. Motivation of the Proposed Approach
      6. Method Description
      7. Experimental Evaluation
      8. Future Research Directions
      9. Conclusion
  12. Section 5: Applications
    1. Chapter 17: Geographical Map Annotation with Significant Tags available from Social Networks
      1. Abstract
      2. INTRODUCTION
      3. BACKGROUND
      4. OPENSTREETMAP
      5. RELATED WORK
      6. MAP ANNOTATION WITH TAGS FROM SOCIAL NETWORK APPLICATIONS
      7. CASE STUDIES
      8. Future Research Directions
      9. Conclusion
    2. Chapter 18: Organizing XML Documents on a Peer–to–Peer Network by Collaborative Clustering
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. XML TRANSACTIONAL REPRESENTATION
      5. COLLABORATIVE CLUSTERING OF XML DOCUMENTS
      6. EXPERIMENTAL EVALUATION
      7. CONCLUSION AND FUTURE RESEARCH
    3. Chapter 19: Incorporating Qualitative Information for Credit Risk Assessment through Frequent Subtree Mining for XML
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. STUDIES OF RELEVANT WORKS
      5. PROPOSED METHOD
      6. PRELIMINARY EXPERIMENTS
      7. FUTURE RESEARCH DIRECTIONS
      8. CONCLUSION
  13. About the Contributors
  14. Index

Chapter 1

A Study of XML Models for Data Mining: Representations, Methods, and Issues

Sangeetha Kutty

Queensland University of Technology, Australia

Richi Nayak

Queensland University of Technology, Australia

Tien Tran

Queensland University of Technology, Australia

ABSTRACT

With the increasing number of XML documents in varied domains, it has become essential to identify ways of finding interesting information in these documents. Data mining techniques can be used to derive this information. However, because these documents are semi-structured, the mining of XML documents is affected by the data model used to represent them. In this chapter, we present an overview of the various models for representing XML documents, how these ...
