Big Data Management, Technologies, and Applications

Book Description

Business, research, and the sciences generate a tremendous amount of data every day, so big data is everywhere. Handling data of this size, complexity, and lack of structure requires alternative management and processing methods. Big Data Management, Technologies, and Applications discusses the exponential growth of information and the innovative methods for capturing, storing, sharing, and analyzing big data. Given the prevalence of big data, this collection of articles on big data methodologies and technologies is beneficial for IT workers, researchers, students, and practitioners in this timely field.

Table of Contents

  1. Cover
  2. Title Page
  3. Copyright Page
  4. Book Series
  5. Editorial Advisory Board and List of Reviewers
    1. Editorial Advisory Board
    2. List of Reviewers
  6. Foreword
  7. Preface
    1. INTRODUCTION
    2. ESSENTIAL BIG DATA MANAGEMENT, TECHNOLOGIES, AND APPLICATIONS
    3. ORGANIZATION OF THE BOOK
    4. SUMMARY
  8. Acknowledgment
  9. Section 1: Big Data Technologies, Methods, and Algorithms
    1. Chapter 1: Technologies for Big Data
      1. ABSTRACT
      2. INTRODUCTION: THE CHALLENGE OF BIG DATA
      3. MAPREDUCE AND HADOOP DISTRIBUTED FILE SYSTEM
      4. NoSQL DATABASE SYSTEMS
      5. CONCLUSION
    2. Chapter 2: Applying the K-Means Algorithm in Big Raw Data Sets with Hadoop and MapReduce
      1. ABSTRACT
      2. INTRODUCTION
      3. RELATED WORK
      4. HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
      5. DATA MINING
      6. SIMULATIONS
      7. COMPLEXITY AND COMPARISONS
      8. DISCUSSION AND CONCLUSION
    3. Chapter 3: Synchronizing Execution of Big Data in Distributed and Parallelized Environments
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. PROBLEMS IN DEALING WITH BIG DATA OVER DISTRIBUTED SYSTEMS
      5. SYNCHRONOUS PARALLELIZATION THROUGH LOAD BALANCING
      6. CASE STUDY
      7. FUTURE RESEARCH DIRECTIONS
      8. CONCLUSION
    4. Chapter 4: Parallel Data Reduction Techniques for Big Datasets
      1. ABSTRACT
      2. INTRODUCTION
      3. GENERAL REDUCTION TECHNIQUES
      4. PARALLEL DATA REDUCTION ALGORITHMS
      5. CONCLUSION
  10. Section 2: Big Data Storage, Management, and Sharing
    1. Chapter 5: Techniques for Sampling Online Text-Based Data Sets
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. SAMPLING AND BIG DATA SETS
      5. FUTURE RESEARCH DIRECTIONS
      6. CONCLUSION
    2. Chapter 6: Big Data Warehouse Automatic Design Methodology
      1. ABSTRACT
      2. 1. INTRODUCTION
      3. 2. RELATED WORK
      4. 3. METHODOLOGY
      5. 4. REQUIREMENT ANALYSIS
      6. 5. SOURCE ANALYSIS AND INTEGRATION
      7. 6. CONCEPTUAL DESIGN
      8. 7. CASE STUDY
      9. 8. FUTURE RESEARCH
      10. 9. CONCLUSION
    3. Chapter 7: Big Data Management in the Context of Real-Time Data Warehousing
      1. ABSTRACT
      2. INTRODUCTION
      3. RELATED WORK
      4. EXISTING MESHJOIN
      5. CACHEJOIN
      6. PERFORMANCE EXPERIMENTS
      7. SUMMARY
    4. Chapter 8: Big Data Sharing Among Academics
      1. ABSTRACT
      2. INTRODUCTION
      3. DATA SHARING PRACTICES AND TRENDS
      4. CASE STUDIES: DISCIPLINARY REPOSITORIES
      5. FUTURE RESEARCH DIRECTIONS AND CONCLUSION
  11. Section 3: Specific Big Data
    1. Chapter 9: Scalable Data Mining, Archiving, and Big Data Management for the Next Generation Astronomical Telescopes
      1. ABSTRACT
      2. 1. INTRODUCTION
      3. 2. SCIENCE AND BIG DATA CHALLENGES IN NEXT GENERATION ASTRONOMICAL INSTRUMENTS
      4. 3. BIG DATA TECHNOLOGIES FROM THE APACHE SOFTWARE FOUNDATION
      5. 4. MAIN FOCUS OF THE CHAPTER
      6. 5. FUTURE RESEARCH DIRECTIONS AND CONCLUSION
    2. Chapter 10: Efficient Metaheuristic Approaches for Exploration of Online Social Networks
      1. ABSTRACT
      2. 1. INTRODUCTION
      3. 2. BIG DATA ANALYTIC IN SOCIAL NETWORKS
      4. 3. MATHEMATICAL MODELS
      5. 4. PROPOSED METAHEURISTIC METHODS
      6. 5. COMPUTATIONAL RESULTS
      7. 6. CONCLUSION
    3. Chapter 11: Big Data at Scale for Digital Humanities
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. HATHITRUST RESEARCH CENTER
      5. DISCOVERY AND ACCESS
      6. SERVICES MANAGEMENT
      7. DATA MANAGEMENT
      8. FUTURE RESEARCH DIRECTIONS
      9. CONCLUSION
    4. Chapter 12: GeoBase
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. PROBLEMS IN END-TO-END ANALYSIS
      5. GEOBASE
      6. EXPERIMENTS
      7. FUTURE RESEARCH DIRECTIONS
      8. CONCLUSION
    5. Chapter 13: Large-Scale Sensor Network Analysis
      1. ABSTRACT
      2. 1. INTRODUCTION
      3. 2. PRELIMINARIES
      4. 3. SUBSEQUENCE CLUSTERING
      5. 4. MULTI-SCALE ANALYSIS
      6. 5. MODELING DEPENDENCIES
      7. 6. CONCLUSION
  12. Section 4: Big Data and Computer Systems and Big Data Benchmarks
    1. Chapter 14: Accelerating Large-Scale Genome-Wide Association Studies with Graphics Processors
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. DESIGN AND IMPLEMENTATION
      5. FUTURE RESEARCH DIRECTIONS
      6. CONCLUSION
    2. Chapter 15: The Need to Consider Hardware Selection when Designing Big Data Applications Supported by Metadata
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND
      4. BUSINESS RULES AND METADATA STORES
      5. FUTURE RESEARCH DIRECTIONS
      6. CONCLUSION
    3. Chapter 16: Excess Entropy in Computer Systems
      1. ABSTRACT
      2. INTRODUCTION
      3. BACKGROUND: ENTROPY, IMBALANCE AND CONCENTRATION
      4. APPLICATION 1: IMBALANCE IN RESOURCE USE BY WINDOWS AZURE STORAGE ACCOUNTS
      5. THE INTERPRETATION OF NORMALIZED EXCESS ENTROPY VALUES
      6. THE MEANING AND RAMIFICATIONS OF HIGH EXCESS ENTROPY VALUES
      7. COMPOSITE EXCESS ENTROPY: A SYSTEM AND ITS SUBSYSTEMS
      8. SUMMARY
    4. Chapter 17: A Review of System Benchmark Standards and a Look Ahead Towards an Industry Standard for Benchmarking Big Data Workloads
      1. ABSTRACT
      2. 1. INTRODUCTION TO SYSTEM BENCHMARKS
      3. 2. INDUSTRY STANDARD BENCHMARKS
      4. 3. APPLICATION BENCHMARKS
      5. 4. SYNTHETIC WORKLOADS
      6. 5. CHANGING INDUSTRY LANDSCAPE
      7. 6. CONCLUSION
  13. Compilation of References
  14. About the Contributors