Data Science For Dummies

Book Description

Discover how data science can help you gain in-depth insight into your business - the easy way!

Jobs in data science abound, but few people have the data science skills needed to fill these increasingly important roles in organizations. Data Science For Dummies is the perfect starting point for IT professionals and students interested in making sense of their organization's massive data sets and applying their findings to real-world business scenarios. From uncovering rich data sources to managing large amounts of data within hardware and software limitations, ensuring consistency in reporting, merging various data sources, and beyond, you'll develop the know-how you need to effectively interpret data and tell a story that can be understood by anyone in your organization.

  • Provides a background in data science fundamentals before moving on to working with relational databases and unstructured data and preparing your data for analysis

  • Details different data visualization techniques that can be used to showcase and summarize your data

  • Explains both supervised and unsupervised machine learning, including regression, model validation, and clustering techniques

  • Includes coverage of big data processing tools like MapReduce, Hadoop, Dremel, Storm, and Spark

It's a big, big data world out there - let Data Science For Dummies help you harness its power and gain a competitive edge for your organization.

    Table of Contents

      1. Cover
      2. Foreword
      3. Introduction
        1. About This Book
        2. Foolish Assumptions
        3. Icons Used in This Book
        4. Beyond the Book
        5. Where to Go from Here
      4. Part I: Getting Started With Data Science
        1. Chapter 1: Wrapping Your Head around Data Science
          1. Seeing Who Can Make Use of Data Science
          2. Looking at the Pieces of the Data Science Puzzle
          3. Getting a Basic Lay of the Data Science Landscape
        2. Chapter 2: Exploring Data Engineering Pipelines and Infrastructure
          1. Defining Big Data by Its Four Vs
          2. Identifying Big Data Sources
          3. Grasping the Difference between Data Science and Data Engineering
          4. Boiling Down Data with MapReduce and Hadoop
          5. Identifying Alternative Big Data Solutions
          6. Data Engineering in Action — A Case Study
        3. Chapter 3: Applying Data Science to Business and Industry
          1. Incorporating Data-Driven Insights into the Business Process
          2. Distinguishing Business Intelligence and Data Science
          3. Knowing Who to Call to Get the Job Done Right
          4. Exploring Data Science in Business: A Data-Driven Business Success Story
      5. Part II: Using Data Science to Extract Meaning from Your Data
        1. Chapter 4: Introducing Probability and Statistics
          1. Introducing the Fundamental Concepts of Probability
          2. Introducing Linear Regression
          3. Simulations
          4. Introducing Time Series Analysis
        2. Chapter 5: Clustering and Classification
          1. Introducing the Basics of Clustering and Classification
          2. Identifying Clusters in Your Data
        3. Chapter 6: Clustering and Classification with Nearest Neighbor Algorithms
          1. Making Sense of Data with Nearest Neighbor Analysis
          2. Seeing the Importance of Clustering and Classification
          3. Classifying Data with Average Nearest Neighbor Algorithms
          4. Classifying with K-Nearest Neighbor Algorithms
          5. Using Nearest Neighbor Distances to Infer Meaning from Point Patterns
          6. Solving Real-World Problems with Nearest Neighbor Algorithms
        4. Chapter 7: Mathematical Modeling in Data Science
          1. Introducing Multi-Criteria Decision Making (MCDM)
          2. Using Numerical Methods in Data Science
          3. Mathematical Modeling with Markov Chains and Stochastic Methods
        5. Chapter 8: Modeling Spatial Data with Statistics
          1. Generating Predictive Surfaces from Spatial Point Data
          2. Using Trend Surface Analysis on Spatial Data
      6. Part III: Creating Data Visualizations that Clearly Communicate Meaning
        1. Chapter 9: Following the Principles of Data Visualization Design
          1. Understanding the Types of Visualizations
          2. Focusing on Your Audience
          3. Picking the Most Appropriate Design Style
          4. Knowing When to Add Context
          5. Knowing When to Get Persuasive
          6. Choosing the Most Appropriate Data Graphic Type
          7. Choosing Your Data Graphic
        2. Chapter 10: Using D3.js for Data Visualization
          1. Introducing the D3.js Library
          2. Knowing When to Use D3.js (and When Not To)
          3. Getting Started in D3.js
          4. Understanding More Advanced Concepts and Practices in D3.js
        3. Chapter 11: Web-Based Applications for Visualization Design
          1. Using Collaborative Data Visualization Platforms
          2. Visualizing Spatial Data with Online Geographic Tools
          3. Visualizing with Open Source: Web-Based Data Visualization Platforms
          4. Knowing When to Stick with Infographics
        4. Chapter 12: Exploring Best Practices in Dashboard Design
          1. Focusing on the Audience
          2. Starting with the Big Picture
          3. Getting the Details Right
          4. Testing Your Design
        5. Chapter 13: Making Maps from Spatial Data
          1. Getting into the Basics of GIS
          2. Analyzing Spatial Data
          3. Getting Started with Open-Source QGIS
      7. Part IV: Computing for Data Science
        1. Chapter 14: Using Python for Data Science
          1. Understanding Basic Concepts in Python
          2. Getting on a First-Name Basis with Some Useful Python Libraries
          3. Using Python to Analyze Data — An Example Exercise
        2. Chapter 15: Using Open Source R for Data Science
          1. Introducing the Fundamental Concepts
          2. Previewing R Packages
        3. Chapter 16: Using SQL in Data Science
          1. Getting Started with SQL
          2. Using SQL and Its Functions in Data Science
        4. Chapter 17: Software Applications for Data Science
          1. Making Life Easier with Excel
          2. Using KNIME for Advanced Data Analytics
      8. Part V: Applying Domain Expertise to Solve Real-World Problems Using Data Science
        1. Chapter 18: Using Data Science in Journalism
          1. Exploring the Five Ws and an H
          2. Collecting Data for Your Story
          3. Finding and Telling Your Data’s Story
          4. Bringing Data Journalism to Life: Washington Post’s The Black Budget
        2. Chapter 19: Delving into Environmental Data Science
          1. Modeling Environmental-Human Interactions with Environmental Intelligence
          2. Modeling Natural Resources in the Raw
          3. Using Spatial Statistics to Predict for Environmental Variation across Space
        3. Chapter 20: Data Science for Driving Growth in E-Commerce
          1. Making Sense of Data for E-Commerce Growth
          2. Optimizing E-Commerce Business Systems
        4. Chapter 21: Using Data Science to Describe and Predict Criminal Activity
          1. Temporal Analysis for Crime Prevention and Monitoring
          2. Spatial Crime Prediction and Monitoring
          3. Probing the Problems with Data Science for Crime Analysis
      9. Part VI: The Part of Tens
        1. Chapter 22: Ten Phenomenal Resources for Open Data
          1. Digging through Data.gov
          2. Checking Out Canada Open Data
          3. Diving into data.gov.uk
          4. Checking Out U.S. Census Bureau Data
          5. Knowing NASA Data
          6. Wrangling World Bank Data
          7. Getting to Know Knoema Data
          8. Queuing Up with Quandl Data
          9. Exploring Exversion Data
          10. Mapping OpenStreetMap Spatial Data
        2. Chapter 23: Ten (or So) Free Data Science Tools and Applications
          1. Making Custom Web-Based Data Visualizations with Free R Packages
          2. Checking Out More Scraping, Collecting, and Handling Tools
          3. Checking Out More Data Exploration Tools
          4. Checking Out More Web-Based Visualization Tools
      10. About the Author
      11. Cheat Sheet
      12. Advertisement Page
      13. Connect with Dummies
      14. End User License Agreement