Mastering SQL Server 2014 Data Mining

Book Description

Master selecting, applying, and deploying data mining models to build powerful predictive analysis frameworks

In Detail

Whether you are new to data mining or a seasoned expert, this book will provide you with the skills you need to successfully create, customize, and work with the Microsoft Data Mining Suite. Starting with the basics, it covers how to clean the data, design the problem, and choose the data mining model that will give you the most accurate prediction.

Next, you will be taken through the various classification models, such as the decision tree, neural network, and Naïve Bayes models. Following this, you'll learn about the clustering and association algorithms, along with the sequencing and regression algorithms, and understand the Data Mining Extensions (DMX) associated with each algorithm. With ample screenshots that offer a step-by-step account of how to build a data mining solution, this book will help you succeed with this data mining system.

What You Will Learn

  • Get an overview of the data mining life cycle
  • Understand the intricacies of SQL Server BI Suite with the help of a practical example
  • Collate data from diverse data sources and build a data warehouse
  • Gain in-depth knowledge about the various data mining models such as classification, segmentation, association, and more
  • Perform data mining using Big Data and Excel add-ins
  • Work on real-world data and gain insights into it using various data mining algorithms
  • Fine-tune data mining models
  • Troubleshoot problems encountered during data mining activities performed in this book

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

    Table of Contents

    1. Mastering SQL Server 2014 Data Mining
      1. Table of Contents
      2. Mastering SQL Server 2014 Data Mining
      3. Credits
      4. About the Authors
      5. About the Reviewers
      6. www.PacktPub.com
        1. Support files, eBooks, discount offers, and more
          1. Why subscribe?
          2. Free access for Packt account holders
          3. Instant updates on new Packt books
      7. Preface
        1. Data mining – what's the need?
          1. Information extraction
          2. Information extraction methodologies
          3. Data analysis
          4. Online analytical processing
          5. Data mining
          6. Data mining for gaming
            1. Data mining for business
            2. Data mining for spatial data
            3. Data mining for sensor data
            4. Looking for correlation
        2. What this book covers
        3. What you need for this book
        4. Who this book is for
        5. Conventions
        6. Reader feedback
        7. Customer support
          1. Downloading the example code
          2. Downloading the color images of this book
          3. Errata
          4. Piracy
          5. Questions
      8. 1. Identifying, Staging, and Understanding Data
        1. Data mining life cycle
        2. Staging data
          1. Extract, transform, and load
          2. Data warehouse
            1. Measures and dimensions
            2. Schema
            3. Data mart
          3. Refreshing data
        3. Understanding and cleansing data
        4. Summary
      9. 2. Data Model Preparation and Deployment
        1. Preparing data models
          1. Cross-Industry Standard Process for Data Mining
        2. Validating data models
          1. Preparing the data mining models
        3. Deploying data models
          1. Updating the models
        4. Summary
      10. 3. Tools of the Trade
        1. SQL Server BI Suite
          1. SQL Server Engine
          2. SQL Server Data Tools
          3. SQL Server Data Quality Services
          4. SQL Server Integration Services
          5. SQL Server Analysis Services
          6. SQL Server Reporting Services
        2. References
        3. Summary
      11. 4. Preparing the Data
        1. Listing of popular databases
          1. Migrating data from popular databases to a staging database
          2. Migrating data from IBM DB2
          3. Building a data warehouse
          4. Automating data ingestion
        2. Summary
      12. 5. Classification Models
        1. Input, output, and predicted columns
        2. The feature selection
        3. The Microsoft Decision Tree algorithm
          1. Data Mining Extensions for the Decision Tree algorithm
        4. The Microsoft Neural Network algorithm
          1. Data Mining Extensions for the Neural Network algorithm
        5. The Microsoft Naïve Bayes algorithm
          1. Data Mining Extensions for the Naïve Bayes algorithm
        6. Summary
      13. 6. Segmentation and Association Models
        1. The Microsoft Clustering algorithm
          1. Data Mining Extensions for the Microsoft Clustering models
        2. The Microsoft Association algorithm
          1. Data Mining Extensions for the Microsoft Association models
        3. Summary
      14. 7. Sequence and Regression Models
        1. The Microsoft Sequence Clustering algorithm
          1. Data Mining Extensions for the Microsoft Sequence Clustering models
        2. The Microsoft Time Series algorithm
        3. Summary
      15. 8. Data Mining Using Excel and Big Data
        1. Data mining using Microsoft Excel
        2. Data mining using HDInsight and Microsoft Azure Machine Learning
          1. Microsoft Azure
          2. Microsoft HDInsight
          3. HDInsight PowerShell
          4. Microsoft Azure Machine Learning
        3. Summary
      16. 9. Tuning the Models
        1. Getting the real-world data
          1. Building the decision tree model
          2. Tuning the model
        2. Adding a clustering model to the data mining structure
        3. Adding the Neural Network model to the data mining structure
          1. Comparing the predictions of different models
        4. Summary
      17. 10. Troubleshooting
        1. A fraction of rows get transferred into a SQL table
        2. Error during changing of the data type of the table
        3. Troubleshooting the data mining structure performance
          1. The Decision Tree algorithm
          2. The Naïve Bayes algorithm
          3. The Microsoft Clustering algorithm
          4. The Microsoft Association algorithm
          5. The Microsoft Time Series algorithm
        4. Error during the deployment of a model
        5. Summary
      18. Index