Data Mining For Dummies

Book Description

Delve into your data for the key to success

Data mining is quickly becoming integral to creating value and business momentum. The ability to detect unseen patterns hidden in the numbers exhaustively generated by day-to-day operations allows savvy decision-makers to exploit every tool at their disposal in the pursuit of better business. By creating models and testing whether patterns hold up, you can discover new intelligence that could change your business's entire paradigm and lead to a more successful outcome.

Data Mining for Dummies shows you why it doesn't take a data scientist to gain this advantage, and empowers average business people to start shaping a process relevant to their business's needs. In this book, you'll learn the hows and whys of mining to the depths of your data, and how to make the case for heavier investment in data mining capabilities. The book explains the details of the knowledge discovery process, including:

  • Model creation, validity testing, and interpretation

  • Effective communication of findings

  • Available tools, both paid and open-source

  • Data selection, transformation, and evaluation

Data Mining for Dummies takes you step-by-step through a real-world data-mining project using open-source tools that allow you to get immediate hands-on experience working with large amounts of data. You'll gain the confidence you need to start making data mining practices a routine part of your successful business. If you're serious about doing everything you can to push your company to the top, Data Mining for Dummies is your ticket to effective data mining.

    Table of Contents

      1. Introduction
        1. About This Book
        2. Foolish Assumptions
        3. Icons Used in This Book
        4. Beyond the Book
        5. Where to Go from Here
      2. Part I: Getting Started with Data Mining
        1. Chapter 1: Catching the Data-Mining Train
          1. Getting Real about Data Mining
            1. Not your professor’s statistics
            2. The value of data mining
            3. Working for it
          2. Doing What Data Miners Do
            1. Focusing on the business
            2. Understanding how data miners spend their time
            3. Getting to know the data-mining process
            4. Making models
            5. Understanding mathematical models
            6. Putting information into action
          3. Discovering Tools and Methods
            1. Visual programming
            2. Working quick and dirty
            3. Testing, testing, and testing some more
        2. Chapter 2: A Day in Your Life as a Data Miner
          1. Starting Your Day Off Right
            1. Meeting the team
            2. Exploring with aim
            3. Structuring time with the right process
          2. Understanding Your Business Goals
          3. Understanding Your Data
            1. Describing data
            2. Exploring data
            3. Cleaning data
          4. Preparing Your Data
            1. Taking first steps with the property data
            2. Preparing the ownership change indicator
            3. Merging the datasets
            4. Deriving new variables
          5. Modeling Your Data
            1. Using balanced data
            2. Splitting data
            3. Building a model
          6. Evaluating Your Results
            1. Examining the decision tree
            2. Using a diagnostic chart
            3. Assessing the status of the model
          7. Putting Your Results into Action
        3. Chapter 3: Teaming Up to Reach Your Goals
          1. Nothing Could Be Finer Than to Be a Data Miner
            1. You can be a data miner
            2. Using the knowledge you have
          2. Data Miners Play Nicely with Others
            1. Cooperation is a necessity
            2. Oh, the people you’ll meet!
          3. Working with Executives
            1. Greetings and elicitations
            2. Lining up your priorities
            3. Talking data mining with executives
      3. Part II: Exploring Data-Mining Mantras and Methods
        1. Chapter 4: Learning the Laws of Data Mining
          1. 1st Law: Business Goals
          2. 2nd Law: Business Knowledge
          3. 3rd Law: Data Preparation
          4. 4th Law: Right Model
          5. 5th Law: Pattern
          6. 6th Law: Amplification
          7. 7th Law: Prediction
          8. 8th Law: Value
          9. 9th Law: Change
        2. Chapter 5: Embracing the Data-Mining Process
          1. Whose Standard Is It, Anyway?
            1. Approaching the process in phases
            2. Cycling through phases and projects
            3. Documenting your work
          2. Business Understanding
          3. Data Understanding
          4. Data Preparation
          5. Modeling
          6. Evaluation
          7. Deployment
        3. Chapter 6: Planning for Data-Mining Success
          1. Setting the Course with Formal Business Cases
            1. Satisfying the boss
            2. Minimizing your own risk
          2. Building Business Cases
            1. Elements of the business case
            2. Putting it in writing
            3. The basics on benefits
          3. Avoiding the Failure Option
        4. Chapter 7: Gearing Up with the Right Software
          1. Putting Data-Mining Tools in Perspective
            1. Avoiding software risks
            2. Focusing on business goals, not tools
            3. Determining what you need
            4. Comparing tools
            5. Shopping for software
          2. Evaluating Software
            1. Don’t fall in love (with your software)
            2. Engaging with sales representatives
            3. The sales professional’s mantra — BANT
      4. Part III: Gathering the Raw Materials
        1. Chapter 8: Digging into Your Data
          1. Focusing on a Problem
          2. Managing Scope
          3. Using Your Organization’s Own Data
            1. Appreciating your own data
            2. Handling data with respect
        2. Chapter 9: Making New Data
          1. Fathoming Loyalty Programs
            1. Grasping the loyalty concept
            2. Your data bonanza
            3. Putting loyalty data to work
          2. Testing, Testing . . .
            1. Experimenting in direct marketing
            2. Spying test opportunities
            3. Testing online
          3. Microtargeting to Win Elections
            1. Treating voters as individuals
            2. Looking at an example
            3. Enhancing voter data
            4. Gaining an information advantage
            5. Developing your own test data
            6. Taking discoveries on the campaign trail
          4. Surveying the Public Landscape
            1. Eliciting information with surveys
            2. Using surveys
            3. Developing questions
            4. Conducting surveys
            5. Recognizing limitations
            6. Bringing in help
          5. Getting into the Field
            1. Going where no data miner has gone before
            2. Doing more than asking
          6. One Challenge, Many Approaches
        3. Chapter 10: Ferreting Out Public Data Sources
          1. Looking Over the Lay of the Land
          2. Exploring Public Data Sources
            1. United States federal government
            2. Governments around the world
            3. United States state and local governments
        4. Chapter 11: Buying Data
          1. Peeking at Consumer Data
          2. Beyond Consumer Data
          3. Desperately Seeking Sources
          4. Assessing Quality and Suitability
      5. Part IV: A Data Miner’s Survival Kit
        1. Chapter 12: Getting Familiar with Your Data
          1. Organizing Data for Mining
          2. Getting Data from There to Here
            1. Text files
            2. Databases
            3. Spreadsheets, XML, and specialty data formats
          3. Surveying Your Data
        2. Chapter 13: Dealing in Graphic Detail
          1. Starting Simple
            1. Eyeballing variables with bar charts and histograms
            2. Relating one variable to another with scatterplots
          2. Building on Basics
            1. Making scatterplots say more
            2. Interacting with scatterplots
          3. Working Fast with Graphs Galore
          4. Extending Your Graphics Range
        3. Chapter 14: Showing Your Data Who’s Boss
          1. Rearranging Data
            1. Controlling variable order
            2. Formatting data properly
            3. Labeling data
            4. Controlling case order
            5. Getting rows and columns right
            6. Putting data where you need it
          2. Sifting Out the Data You Need
            1. Narrowing the fields
            2. Selecting relevant cases
            3. Sampling
          3. Getting the Data Together
            1. Merging
            2. Appending
          4. Making New Data from Old Data
            1. Deriving new variables
            2. Aggregation
          5. Saving Time
        4. Chapter 15: Your Exciting Career in Modeling
          1. Grasping Modeling Concepts
          2. Cultivating Decision Trees
            1. Examining a decision tree
            2. Using decision trees to aid communication
            3. Constructing a decision tree
            4. Getting acquainted with common decision tree types
            5. Adapting to your tools
          3. Neural Networks for Prediction
            1. Looking inside a neural network
            2. Issues surrounding neural network models
          4. Clustering
            1. Supervised and unsupervised learning
            2. Clustering to clarify
      6. Part V: More Data-Mining Methods
        1. Chapter 16: Data Mining Using Classic Statistical Methods
          1. Understanding Correlation
            1. Picturing correlations
            2. Measuring the strength of a correlation
            3. Drawing lines in the data
            4. Giving correlations a try
          2. Understanding Linear Regression
            1. Working with straight lines
            2. Finding the best line
            3. Using linear regression coefficients
            4. Interpreting model statistics
            5. Applying common sense
          3. Understanding Logistic Regression
            1. Looking into logistic regression
            2. Appreciating the appeal of logistic regression
            3. Looking over a logistic regression example
        2. Chapter 17: Mining Data for Clues
          1. Tracking Combinations
          2. Finding Associations in Data
            1. Structuring association rules
            2. Getting ready
            3. Shopping for associations
            4. Refining results
            5. Understanding the metrics
        3. Chapter 18: Expanding Your Horizons
          1. Squeezing More Out of What You Have
            1. Mastering your data-mining application
            2. Fine-tuning your settings
            3. Analyzing your analysis
            4. Using meta-models (ensemble models)
          2. Widening Your Range
            1. Tackling text
            2. Detecting sequences
            3. Working with time series
          3. Taking on Big Data
            1. Coming to terms with Big Data
            2. Conducting predictive analytics with Big Data
          4. Blending Methods for Best Results
      7. Part VI: The Part of Tens
        1. Chapter 19: Ten Great Resources for Data Miners
          1. Society of Data Miners
          2. KDnuggets
          3. All Analytics
          4. The New York Times
          5. Forbes
          6. SmartData Collective
          7. CRISP-DM Process Model
          8. Nate Silver
          9. Meta’s Analytics Articles page
          10. First Internet Gallery of Statistics Jokes
        2. Chapter 20: Ten Useful Kinds of Analysis That Complement Data Mining
          1. Business Analysis
          2. Conjoint Analysis
          3. Design of Experiments
          4. Marketing Mix Modeling
          5. Operations Research
          6. Reliability Analysis
          7. Statistical Process Control
          8. Social Network Analysis
          9. Structural Equation Modeling
          10. Web Analytics
      8. Appendix A: Glossary
      9. Appendix B: Data-Mining Software Sources
        1. Moving Forward
        2. Discovering What’s Available
        3. Software Suppliers
          1. Alteryx
          2. Angoss
          3. IBM
          4. Knime
          5. KXEN, an SAP Company
          6. Megaputer
          7. Oracle
          8. R Foundation
          9. RapidMiner
          10. Revolution Analytics
          11. Salford Systems
          12. SAS Institute
          13. Statsoft
          14. Tableau Software
          15. Teradata
          16. University of Ljubljana
          17. University of Waikato
          18. Wolfram
      10. Appendix C: Major Data Vendors
        1. Acxiom
        2. Corelogic
        3. Datalogix
        4. DataSift
        5. eBureau
        6. Equifax
        7. Experian
        8. Gnip
        9. ID Analytics
        10. Intelius
        11. IRI
        12. Nielsen
        13. PeekYou
        14. Rapleaf
        15. Recorded Future
        16. TransUnion
      11. Appendix D: Sources and Citations
        1. Data Sources
          1. Chapter 2. A Day in Your Life as a Data Miner
          2. Chapter 12. Getting Familiar with Your Data
          3. Chapter 13. Dealing in Graphic Detail
          4. Chapter 14. Showing Your Data Who’s Boss
          5. Chapter 15. Your Exciting Career in Modeling
          6. Chapter 16. Data Mining Using Classic Statistical Methods
          7. Chapter 17. Mining Data for Clues
        2. Other Sources
      12. About the Author
      13. Cheat Sheet