Handbook of Research on Text and Web Mining Technologies

Book Description

The Handbook of Research on Text and Web Mining Technologies is the first comprehensive reference to the state of research in the field of text mining. This compendium of pioneering studies is essential to academic reference collections and introduces researchers and students to cutting-edge techniques for knowledge discovery from unstructured text.

Table of Contents

  1. Copyright
  2. Editorial Advisory Board
  3. List of Contributors
  4. Foreword
  5. Preface
  6. Acknowledgment
  7. About the Editors
  8. Document Preprocessing
    1. On Document Representation and Term Weights in Text Classification
      1. ABSTRACT
      2. INTRODUCTION
      3. TERMS IN DOCUMENT REPRESENTATION
      4. HOW TO COMPUTE THE TERM WEIGHTS
      5. EXPERIMENTAL STUDIES AND RESULTS
      6. CONCLUSION
      7. ACKNOWLEDGMENT
      8. REFERENCES
      9. KEY TERMS
    2. Deriving Document Keyphrases for Text Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. RELATED WORK
      4. KIP: A DOMAIN-SPECIFIC KEYPHRASE EXTRACTION ALGORITHM
      5. KIP'S LEARNING FUNCTION
      6. EXPERIMENT
      7. CONCLUSION
      8. ACKNOWLEDGMENT
      9. REFERENCES
      10. KEY TERMS
    3. Intelligent Text Mining: Putting Evolutionary Methods and Language Technologies Together
      1. ABSTRACT
      2. INTRODUCTION
      3. EVOLUTIONARY KNOWLEDGE DISCOVERY FROM TEXTS
      4. ANALYSIS AND RESULTS
      5. CONCLUSION
      6. REFERENCES
      7. ENDNOTE
  9. Classification and Clustering
    1. Automatic Syllabus Classification Using Support Vector Machines
      1. ABSTRACT
      2. INTRODUCTION
      3. CLASS DEFINITION
      4. FEATURE SELECTION
      5. TRAINING DATA PREPARATION
      6. SUPPORT VECTOR MACHINES
      7. EVALUATION
      8. RELATED WORK
      9. CONCLUSION
      10. ACKNOWLEDGMENT
      11. REFERENCES
      12. KEY TERMS
    2. Partially Supervised Text Categorization
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. THE MAIN TECHNIQUES
      5. S-EM
      6. PEBL
      7. A-EM
      8. FUTURE TRENDS AND CONCLUSION
      9. REFERENCES
      10. KEY TERMS
    3. Image Classification and Retrieval with Mining Technologies
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MAIN THRUST
      5. FUTURE TRENDS
      6. CONCLUSION
      7. ACKNOWLEDGMENT
      8. REFERENCES
      9. KEY TERMS
    4. Improving Techniques for Naïve Bayes Text Classifiers
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. IMPROVING NAÏVE BAYES TEXT CLASSIFIER (1): EM APPROACH
      5. IMPROVING NAÏVE BAYES TEXT CLASSIFIER (2): BOOSTING APPROACH
      6. PERFORMANCE EVALUATION
      7. CONCLUSION
      8. REFERENCES
      9. KEY TERMS
      10. ENDNOTES
    5. Using the Text Categorization Framework for Protein Classification
      1. ABSTRACT
      2. INTRODUCTION
      3. THE PROTEIN CLASSIFICATION CONTEXT
      4. USING THE TEXT CATEGORIZATION FRAMEWORK
      5. CONCLUSION
      6. REFERENCES
      7. KEY TERMS
    6. Featureless Data Clustering
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND AND RELATED WORKS
      4. ADAPTIVE CLUSTERING USING FEATURELESS MEASURES
      5. EXPERIMENTS AND DISCUSSIONS
      6. CONCLUSION AND FUTURE TRENDS
      7. ACKNOWLEDGMENT
      8. REFERENCES
      9. KEY TERMS
    7. Swarm Intelligence in Text Document Clustering
      1. ABSTRACT
      2. INTRODUCTION
      3. PRELIMINARIES
      4. SWARM INTELLIGENCE
      5. SWARM BASED DOCUMENT CLUSTERING ALGORITHM
      6. FUTURE TRENDS
      7. ACKNOWLEDGMENT
      8. REFERENCES
      9. KEY TERMS
    8. Some Efficient and Fast Approaches to Document Clustering
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MAIN THRUST OF THE CHAPTER
      5. FUTURE TRENDS
      6. CONCLUSION
      7. REFERENCES
      8. KEY TERMS
    9. SOM-Based Clustering of Textual Documents Using WordNet
      1. ABSTRACT
      2. INTRODUCTION
      3. STATE OF THE ART
      4. WORDNET AND CLASSIFICATION OF TEXTS
      5. PROPOSED APPROACH
      6. CONCLUSION
      7. REFERENCES
      8. KEY TERMS
    10. A Multi-Agent Neural Network System for Web Text Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. THE FRAMEWORK OF THE BPNN-BASED INTELLIGENT WEB TEXT MINING
      4. MULTI-AGENT BASED WEB TEXT MINING SYSTEM
      5. EXPERIMENT STUDY
      6. CONCLUSION
      7. FUTURE RESEARCH DIRECTIONS
      8. ACKNOWLEDGMENT
      9. REFERENCES
      10. ADDITIONAL READING
  10. Database, Ontology, and the Web
    1. Frequent Mining on XML Documents
      1. ABSTRACT
      2. INTRODUCTION
      3. FREQUENT PATTERN MINING
      4. XML FREQUENT PATTERN MINING
      5. APPLICATIONS AND ISSUES OF XML FREQUENT MINING
      6. CONCLUSION
      7. REFERENCES
      8. KEY TERMS
      9. ENDNOTES
    2. The Process and Application of XML Data Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. REPRESENTATION OF XML DOCUMENTS
      4. XML MINING: TAXONOMY
      5. XML MINING: PROCESS
      6. COMMERCIAL USE OF XML IN DATA MINING
      7. CONCLUSION AND FUTURE DIRECTIONS
      8. REFERENCES
      9. KEY TERMS
      10. ENDNOTES
    3. Approximate Range Querying over Sliding Windows
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MAIN FOCUS OF THE CHAPTER
      5. CONCLUSION
      6. REFERENCES
      7. KEY TERMS
    4. Slicing and Dicing a Linguistic Data Cube
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND: USING A DATA CUBE TO INTEGRATE COMPLEX SETS OF LINGUISTIC DATA
      4. PROCESSING THE INFORMATION IN A CLAUSE CUBE
      5. APPLICATION: SLICING AND DICING A CLAUSE CUBE OF GENESIS 1:1-2:3
      6. CONCLUSION
      7. REFERENCES
      8. KEY TERMS
      9. ENDNOTES
    5. Discovering Personalized Novel Knowledge from Text
      1. ABSTRACT
      2. INTRODUCTION AND BACKGROUND
      3. YOU ARE WHAT YOU READ: DERIVING USER'S BACKGROUND KNOWLEDGE
      4. KNOWLEDGE DISCOVERY
      5. NOVELTY CALCULATION
      6. EVALUATION
      7. CONCLUSION
      8. ACKNOWLEDGMENT
      9. REFERENCES
      10. KEY TERMS
    6. Untangling BioOntologies for Mining Biomedical Information
      1. ABSTRACT
      2. INTRODUCTION
      3. BIOONTOLOGIES
      4. TOWARDS AUTOMATIC ANNOTATION
      5. FUTURE PROSPECTS
      6. ACKNOWLEDGMENT
      7. REFERENCES
      8. KEY TERMS
      9. ENDNOTES
    7. Thesaurus-Based Automatic Indexing
      1. ABSTRACT
      2. INTRODUCTION
      3. THESAURI BASICS
      4. REAL WORLD THESAURI
      5. EUROVOC
      6. AGROVOC
      7. THESAURUS-BASED AUTOMATIC INDEXING
      8. REFERENCES
      9. KEY TERMS
      10. ENDNOTES
    8. Concept-Based Text Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MAIN FOCUS OF THE CHAPTER
      5. FUTURE TRENDS
      6. REFERENCES
      7. KEY TERMS
    9. Statistical Methods for User Profiling in Web Usage Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. WEB USAGE MINING
      4. WEB MINING BRANCHES
      5. WEB USAGE MINING TASKS
      6. WEB DATA ORGANIZATION AND INFORMATION
      7. STATISTICAL TOOLS FOR PATTERN DISCOVERY AND PATTERN ANALYSIS
      8. REFERENCES
      9. KEY TERMS
    10. Web Mining to Identify People of Similar Background
      1. ABSTRACT
      2. INTRODUCTION
      3. PREVIOUS STUDIES
      4. METHODOLOGY
      5. EVALUATION METHOD AND RESULTS
      6. SUMMARY AND FUTURE RESEARCH DIRECTIONS
      7. REFERENCES
      8. KEY TERMS
    11. Hyperlink Structure Inspired by Web Usage
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. DATA MINING TO DISCOVER NAVIGATION PATTERNS
      5. CREATING AN OPTIMAL HYPERLINK STRUCTURE USING GRAPH THEORY
      6. CONCLUSION AND FUTURE RESEARCH AND DEVELOPMENT
      7. REFERENCES
      8. KEY TERMS
    12. Designing and Mining Web Applications: A Conceptual Modeling Approach
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MAIN THRUST OF THE CHAPTER
      5. FUTURE TRENDS
      6. CONCLUSION
      7. REFERENCES
    13. Web Usage Mining for Ontology Management
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. WEB USAGE MINING FOR ONTOLOGY MANAGEMENT
      5. ILLUSTRATION IN TOURISM DOMAIN
      6. CONCLUSION
      7. FUTURE RESEARCH DIRECTIONS
      8. ACKNOWLEDGMENT
      9. REFERENCES
      10. ADDITIONAL READING
    14. A Lattice-Based Framework for Interactively and Incrementally Mining Web Traversal Patterns
      1. ABSTRACT
      2. INTRODUCTION
      3. RELATED WORK
      4. DATA STRUCTURE FOR MINING WEB TRAVERSAL PATTERNS
      5. ALGORITHM FOR INCREMENTAL WEB TRAVERSAL PATTERN MINING
      6. ALGORITHM FOR INTERACTIVE WEB TRAVERSAL PATTERN MINING
      7. EXPERIMENTAL RESULTS
      8. CONCLUSION AND FUTURE WORK
      9. ACKNOWLEDGMENT
      10. REFERENCES
    15. Privacy-Preserving Data Mining on the Web: Foundations and Techniques
      1. ABSTRACT
      2. INTRODUCTION
      3. THE BASIS OF PRIVACY-PRESERVING DATA MINING
      4. A TAXONOMY OF EXISTING PPDM TECHNIQUES
      5. REQUIREMENTS FOR TECHNICAL SOLUTIONS
      6. FUTURE RESEARCH TRENDS
      7. CONCLUSION
      8. REFERENCES
  11. Information Retrieval and Extraction
    1. Automatic Reference Tracking
      1. ABSTRACT
      2. INTRODUCTION
      3. RELATED WORK
      4. MOTIVATION
      5. DESIGN OF AUTOMATIC REFERENCE TRACKER
      6. REFERENCE DOCUMENT REPRESENTATION
      7. RESULTS
      8. LIMITATIONS AND FUTURE DIRECTIONS
      9. CONCLUSION
      10. REFERENCES
      11. KEY TERMS
    2. Determination of Unithood and Termhood for Term Recognition
      1. ABSTRACT
      2. INTRODUCTION
      3. ISSUES AND OVERVIEW OF PROPOSED SOLUTION
      4. BACKGROUND AND RELATED WORKS
      5. NEW MEASURE FOR DETERMINING UNITHOOD
      6. A NEW SCORING AND RANKING SCHEME FOR TERMHOOD
      7. EXPERIMENTS AND DISCUSSIONS
      8. CONCLUSIONS AND FUTURE TRENDS
      9. ACKNOWLEDGMENT
      10. REFERENCES
      11. KEY TERMS
    3. Retrieving Non-Latin Information in a Latin Web: The Case of Greek
      1. ABSTRACT
      2. INTRODUCTION
      3. NON-LATIN WEB SEARCHING
      4. ANALYZING A GREEK QUERY LOG
      5. RUNNING SAMPLE QUERIES
      6. IMPROVING WEB SEARCHING IN GREEK
      7. DISCUSSION
      8. REFERENCES
      9. KEY TERMS
    4. Latent Semantic Analysis and Beyond
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MAIN FOCUS OF THE CHAPTER
      5. APPLICATIONS
      6. FUTURE TRENDS / CONCLUSION
      7. REFERENCES
      8. KEY TERMS
      9. ENDNOTES
    5. Question Answering Using Word Associations
      1. ABSTRACT
      2. INTRODUCTION
      3. QUESTION ANSWERING USING PROBABILISTIC LEXICO-SEMANTIC MODELS
      4. A NOISY STRUCTURED QUERY SIMULATION MODEL FOR QUESTION ANSWERING
      5. REFERENCES
      6. KEY TERMS
      7. ENDNOTES
    6. The Scent of a Newsgroup: Providing Personalized Access to Usenet Sites through Web Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. INFORMATION CONTENT OF A USENET SITE
      4. MINING CONTENT OF NEWS ARTICLES
      5. MINING USAGE OF NEWS ARTICLES
      6. MINING STRUCTURE OF A NEWSGROUP
      7. PERSONALIZATION IN USENET COMMUNITIES
      8. CONCLUSION
      9. REFERENCES
  12. Application and Survey
    1. Text Mining in Program Code
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MAIN FOCUS OF THE CHAPTER
      5. FUTURE TRENDS/CONCLUSION
      6. REFERENCES
      7. KEY TERMS
    2. A Study of Friendship Networks and Blogosphere
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. SOCIAL FRIENDSHIP NETWORKS AND BLOGOSPHERE
      5. SOCIAL FRIENDSHIP NETWORKS VIS-A-VIS BLOGOSPHERE
      6. LOOKING AHEAD
      7. REFERENCES
      8. KEY TERMS
      9. ENDNOTES
    3. An HL7-Aware Decision Support System for E-Health
      1. ABSTRACT
      2. BACKGROUND
      3. MAIN THRUST OF THE CHAPTER
      4. FUTURE TRENDS
      5. CONCLUSION
      6. REFERENCES
      7. KEY TERMS
      8. ENDNOTES
    4. Multitarget Classifiers for Mining in Bioinformatics
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. MAIN FOCUS OF THE CHAPTER
      5. FUTURE TRENDS/CONCLUSION
      6. ACKNOWLEDGMENT
      7. REFERENCES
      8. KEY TERMS
    5. Current Issues and Future Analysis in Text Mining for Information Security Applications
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. TEXT MINING IN SECURITY APPLICATIONS
      5. PRIVACY ISSUES OF TEXT MINING
      6. FUTURE ANALYSIS
      7. REFERENCES
      8. KEY TERMS
    6. Collaborative Filtering Based Recommendation Systems
      1. ABSTRACT
      2. INTRODUCTION
      3. CONCEPTUAL FRAMEWORK
      4. DISTANCE MEASURES
      5. EVALUATION MEASURES
      6. COLLABORATIVE FILTERING TECHNIQUES
      7. CONCLUSIONS AND FUTURE WORK
      8. REFERENCES
      9. KEY TERMS
    7. Performance Evaluation Measures for Text Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. CLASSIFICATION MEASURES
      5. RANKING MEASURES
      6. TEXT CLUSTERING
      7. TEXT SUMMARIZATION
      8. CONCLUSION
      9. ACKNOWLEDGMENT
      10. REFERENCES
      11. KEY TERMS
    8. Text Mining in Bioinformatics: Research and Application
      1. ABSTRACT
      2. INTRODUCTION
      3. FEATURES OF TM AND BIOINFORMATICS
      4. MAIN APPLICATION
      5. RESEARCH METHOD
      6. PROBLEMS AND FUTURE WAYS
      7. REFERENCES
      8. KEY TERMS
    9. Literature Review in Computational Linguistics Issues in the Developing Field of Consumer Informatics: Finding the Right Information for Consumer's Health Information Need
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND CONCEPT: CONSUMER VOCABULARY
      4. RELATED WORK: DESIGNING A COMMUNICATION MAP BETWEEN CONSUMERS AND PROFESSIONALS
      5. DISCUSSION: TOWARDS SEMANTICS
      6. CONCLUSION
      7. REFERENCES
      8. KEY TERMS
    10. A Survey of Selected Software Technologies for Text Mining
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND OF TEXT MINING
      4. MAIN FOCUS OF THE CHAPTER: BACKGROUND OF TEXT MINING SOFTWARE
      5. RESULTS
      6. FUTURE TRENDS/CONCLUSION
      7. ACKNOWLEDGMENT
      8. REFERENCES
      9. KEY TERMS
    11. Application of Text Mining Methodologies to Health Insurance Schedules
      1. ABSTRACT
      2. AUSTRALIAN HEALTH INSURANCE SYSTEM
      3. TEXT MINING
      4. BAG OF WORDS
      5. SUPPORT VECTOR MACHINE AND KERNEL MACHINE METHODOLOGIES
      6. EXPERIMENTS USING BINARY CLASSIFICATION
      7. EXPERIMENTS USING MULTIPLE CLASSIFICATIONS
      8. EXPERIMENTS WITH CLUSTERING ALGORITHMS
      9. CONCLUSION
      10. ACKNOWLEDGMENT
      11. REFERENCES
      12. ENDNOTES
    12. Web Mining System for Mobile-Phone Marketing
      1. ABSTRACT
      2. LITERATURE REVIEW
      3. THE WEB MINING SYSTEM
      4. A CASE STUDY OF THE WEB MINING SYSTEM
      5. SUMMARY AND DISCUSSION
      6. REFERENCES
      7. ENDNOTE
    13. Web Service Architectures for Text Mining: An Exploration of the Issues via an E-Science Demonstrator
      1. ABSTRACT
      2. INTRODUCTION
      3. POSSIBLE WEB SERVICES ARCHITECTURES
      4. CORE TEXT SERVICES IMPLEMENTATION
      5. DISCUSSION
      6. CONCLUSION
      7. REFERENCES
      8. ENDNOTE
    14. APPENDIX
  13. About the Contributors
  14. Index