Big Crisis Data

Book Description

Social media is an invaluable source of time-critical information during a crisis. However, emergency response and humanitarian relief organizations that would like to use this information struggle with an avalanche of social media messages that exceeds human capacity to process. Emergency managers, decision makers, and affected communities can make sense of social media through a combination of machine computation and human compassion, expressed by thousands of digital volunteers who publish, process, and summarize potentially life-saving information. This book brings together computational methods from many disciplines: natural language processing, semantic technologies, data mining, machine learning, network analysis, human-computer interaction, and information visualization. It focuses on methods that are commonly used for processing social media messages under time-critical constraints, and offers more than 500 references to in-depth information.

Table of Contents

  1. Cover
  2. Half title
  3. Dedication
  4. Title
  5. Copyright
  6. Table of Contents
  7. Preface
  8. Acknowledgments
  9. 1 Introduction
    1. 1.1 “Sirens going off now!! Take cover … be safe!”
    2. 1.2 What Is a Disaster?
    3. 1.3 Information Flows in Social Media
    4. 1.4 The Data Deluge
    5. 1.5 Requirements: “Big Picture” Versus “Actionable Insights”
    6. 1.6 Organizational Challenges
    7. 1.7 Scope and Organization of This Book
    8. 1.8 Further Reading and Online Appendix
  10. 2 Volume: Data Acquisition, Storage, and Retrieval
    1. 2.1 Social Media Data Sizes
    2. 2.2 Data Acquisition
    3. 2.3 Postfiltering and De-Duplication
    4. 2.4 Data Representation / Feature Extraction
    5. 2.5 Storage and Indexing
    6. 2.6 Research Problems
    7. 2.7 Further Reading
  11. 3 Vagueness: Natural Language and Semantics
    1. 3.1 Social Media Is Conversational
    2. 3.2 Text Preprocessing
    3. 3.3 Sentiment Analysis
    4. 3.4 Named Entities
    5. 3.5 Geotagging and Geocoding
    6. 3.6 Extracting Structured Information
    7. 3.7 Ontologies for Explicit Semantics
    8. 3.8 Research Problems
    9. 3.9 Further Reading
  12. 4 Variety: Classification and Clustering
    1. 4.1 Content Categories
    2. 4.2 Supervised Classification
    3. 4.3 Unsupervised Classification / Clustering
    4. 4.4 Research Problems
    5. 4.5 Further Reading
  13. 5 Virality: Networks and Information Propagation
    1. 5.1 Crisis Information Networks
    2. 5.2 Cascading of Crisis Information
    3. 5.3 User Communities and User Roles
    4. 5.4 Research Problems
    5. 5.5 Further Reading
  14. 6 Velocity: Online Methods and Data Streams
    1. 6.1 Stream Processing
    2. 6.2 Analyzing Temporal Data
    3. 6.3 Event Detection
    4. 6.4 Event-Detection Methods
    5. 6.5 Incremental Update Summarization
    6. 6.6 Domain-Specific Approaches
    7. 6.7 Research Problems
    8. 6.8 Further Reading
  15. 7 Volunteers: Humanitarian Crowdsourcing
    1. 7.1 Digital Volunteering
    2. 7.2 Organized Digital Volunteering
    3. 7.3 Motivating Volunteers
    4. 7.4 Digital Volunteering Tasks
    5. 7.5 Hybrid Systems
    6. 7.6 Research Problems
    7. 7.7 Further Reading
  16. 8 Veracity: Misinformation and Credibility
    1. 8.1 Emergencies, Media, and False Information
    2. 8.2 Policy-Based Trust and Social Media
    3. 8.3 Misinformation and Disinformation
    4. 8.4 Verification Practices
    5. 8.5 Automatic Credibility Analysis
    6. 8.6 Research Problems
    7. 8.7 Further Reading
  17. 9 Validity: Biases and Pitfalls of Social Media Data
    1. 9.1 Studying the “Offline” World Using “Online” Data
    2. 9.2 The Digital Divide
    3. 9.3 Content Production Issues
    4. 9.4 Infrastructure and Technological Factors
    5. 9.5 The Geography of Events and Geotagged Social Media
    6. 9.6 Evaluation of Alerts Triggered from Social Media
    7. 9.7 Research Problems
    8. 9.8 Further Reading
  18. 10 Visualization: Crisis Maps and Beyond
    1. 10.1 Crisis Maps
    2. 10.2 Crisis Dashboards
    3. 10.3 Interactivity
    4. 10.4 Research Problems
    5. 10.5 Further Reading
  19. 11 Values: Privacy and Ethics
    1. 11.1 Protecting the Privacy of Individuals
    2. 11.2 Intentional Human-Induced Disasters
    3. 11.3 Protecting Citizen Reporters and Digital Volunteers
    4. 11.4 Ethical Experimentation
    5. 11.5 Giving Back and Sharing Data
    6. 11.6 Research Problems
    7. 11.7 Further Reading
  20. 12 Conclusions and Outlook
    1. 12.1 The Quality of Crisis Information
    2. 12.2 Peer Production of Crisis Information
    3. 12.3 Technologies for Crisis Communications in Social Media
    4. 12.4 User-Generated Images, Video, and Aerial Photography
    5. 12.5 Outlook
  21. Bibliography
  22. Index
  23. Terms and Acronyms