Python Business Intelligence Cookbook

Book Description

Leverage the computational power of Python with more than 60 recipes that arm you with the required skills to make informed business decisions

About This Book

  • Want to minimize risk and optimize profits of your business? Learn to create efficient analytical reports with ease using this highly practical, easy-to-follow guide
  • Learn to apply Python for business intelligence tasks—preparing, exploring, analyzing, visualizing and reporting—in order to make more informed business decisions using data at hand
  • Learn to explore and analyze business data, and build business intelligence dashboards with the help of various insightful recipes

Who This Book Is For

This book is intended for data analysts, managers, and executives with a basic knowledge of Python, who now want to use Python for their BI tasks. If you have a good knowledge and understanding of BI applications and have a “working” system in place, this book will enhance your toolbox.

What You Will Learn

  • Install Anaconda, MongoDB, and everything you need to get started with your data analysis
  • Prepare data for analysis by querying, cleaning, and standardizing it
  • Explore your data by creating a Pandas DataFrame from MongoDB
  • Gain powerful insights, both statistical and predictive, to make informed business decisions
  • Visualize your data by building dashboards and generating reports
  • Create a complete data processing and business intelligence system

In Detail

The amount of data produced by businesses and devices keeps growing. Python's major advantage here is that it is a general-purpose language that gives you a lot of flexibility in data structures. It is also an excellent tool for more specialized analysis tasks, backed by libraries for processing data streams, visualizing datasets, and carrying out scientific calculations. Using Python for business intelligence (BI) can help you solve tricky problems in one go.

Rather than spending day after day scouring Internet forums for “how-to” information, here you’ll find more than 60 recipes that take you through the entire process of creating actionable intelligence from your raw data, no matter what shape or form it’s in. Within the first 30 minutes of opening this book, you’ll learn how to use the latest in Python and NoSQL databases to glean insights from data just waiting to be exploited.

We’ll begin with a quick-fire introduction to Python for BI and show you what problems Python solves. From there, we move on to working with a predefined data set to extract data as per business requirements, using the Pandas library and MongoDB as our storage engine.

Next, we will analyze data and perform transformations for BI with Python. Through this, you will gather insights that help you make informed decisions for your business. The final part of the book will show you the most important task of BI: visualizing data by building stunning dashboards using Matplotlib, PyTables, and IPython Notebook.

Style and approach

This is a step-by-step guide to help you prepare, explore, analyze and report data, written in a conversational tone to make it easy to grasp. Whether you’re new to BI or are looking for a better way to work, you’ll find the knowledge and skills here to get your job done efficiently.

Downloading the example code for this book: You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

  1. Python Business Intelligence Cookbook
    1. Table of Contents
    2. Python Business Intelligence Cookbook
    3. Credits
    4. About the Author
    5. About the Reviewer
    6. www.PacktPub.com
      1. Support files, eBooks, discount offers, and more
        1. Why subscribe?
        2. Free access for Packt account holders
    7. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Sections
        1. Getting ready
        2. How to do it…
        3. How it works…
        4. There's more…
        5. See also
      5. Conventions
      6. Reader feedback
      7. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    8. 1. Getting Set Up to Gain Business Intelligence
      1. Introduction
      2. Installing Anaconda
        1. Getting ready
        2. How to do it…
          1. Mac OS X 10.10.4
          2. Windows 8.1
          3. Linux Ubuntu server 14.04.2 LTS
        3. How it works…
      3. Learn about the Python libraries we will be using
      4. Installing, configuring, and running MongoDB
        1. Getting ready
        2. How to do it…
          1. Mac OS X
          2. Windows
          3. Linux
        3. How it works…
      5. Installing Rodeo
        1. Getting ready
        2. How to do it…
        3. How it works…
      6. Starting Rodeo
        1. Getting ready
        2. How to do it…
      7. Installing Robomongo
        1. Getting ready
        2. How to do it…
          1. Mac OS X
          2. Windows
      8. Using Robomongo to query MongoDB
        1. Getting ready
        2. How to do it…
      9. Downloading the UK Road Safety Data dataset
        1. How to do it…
        2. How it works…
          1. Why we are using this dataset
    9. 2. Making Your Data All It Can Be
      1. Importing a CSV file into MongoDB
        1. Getting ready
        2. How to do it…
        3. How it works…
        4. There's more…
      2. Importing an Excel file into MongoDB
        1. Getting ready
        2. How to do it…
        3. How it works…
      3. Importing a JSON file into MongoDB
        1. Getting ready
        2. How to do it…
      4. Importing a plain text file into MongoDB
        1. How to do it…
        2. How it works…
      5. Retrieving a single record using PyMongo
        1. Getting ready
        2. How to do it…
        3. How it works…
      6. Retrieving multiple records using PyMongo
        1. Getting ready
        2. How to do it…
        3. How it works…
      7. Inserting a single record using PyMongo
        1. Getting ready
        2. How to do it…
        3. How it works…
      8. Inserting multiple records using PyMongo
        1. Getting ready
        2. How to do it…
        3. How it works…
      9. Updating a single record using PyMongo
        1. Getting ready
        2. How to do it…
        3. How it works…
      10. Updating multiple records using PyMongo
        1. Getting ready
        2. How to do it…
        3. How it works…
      11. Deleting a single record using PyMongo
        1. Getting ready
        2. How to do it…
        3. How it works…
      12. Deleting multiple records using PyMongo
        1. Getting ready
        2. How to do it…
        3. How it works…
      13. Importing a CSV file into a Pandas DataFrame
        1. Getting ready
        2. How to do it…
        3. How it works…
        4. There's more…
      14. Renaming column headers in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      15. Filling in missing values in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      16. Removing punctuation in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      17. Removing whitespace in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      18. Removing any string from within a string in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      19. Merging two datasets in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      20. Titlecasing anything
        1. Getting ready
        2. How to do it…
        3. How it works…
      21. Uppercasing a column in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      22. Updating values in place in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      23. Standardizing a Social Security number in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      24. Standardizing dates in Pandas
        1. Getting ready
        2. How to do it…
        3. How it works…
      25. Converting categories to numbers in Pandas for a speed boost
        1. Getting ready
        2. How to do it…
        3. How it works…
    10. 3. Learning What Your Data Truly Holds
      1. Creating a Pandas DataFrame from a MongoDB query
        1. Getting ready
        2. How to do it…
        3. How it works…
      2. Creating a Pandas DataFrame from a CSV file
        1. How to do it…
        2. How it works…
      3. Creating a Pandas DataFrame from an Excel file
        1. How to do it…
        2. How it works…
      4. Creating a Pandas DataFrame from a JSON file
        1. How to do it…
        2. How it works…
      5. Creating a data quality report
        1. Getting ready
        2. How to do it…
        3. How it works…
      6. Generating summary statistics for the entire dataset
        1. How to do it…
        2. How it works…
      7. Generating summary statistics for object type columns
        1. How to do it…
        2. How it works…
      8. Getting the mode of the entire dataset
        1. How to do it…
        2. How it works…
      9. Generating summary statistics for a single column
        1. How to do it…
        2. How it works…
      10. Getting a count of unique values for a single column
        1. How to do it…
        2. How it works…
          1. Additional arguments
      11. Getting the minimum and maximum values of a single column
        1. How to do it…
        2. How it works…
      12. Generating quantiles for a single column
        1. How to do it…
        2. How it works…
      13. Getting the mean, median, mode, and range for a single column
        1. How to do it…
        2. How it works…
      14. Generating a frequency table for a single column by date
        1. Getting ready
        2. How to do it…
        3. How it works…
      15. Generating a frequency table of two variables
        1. Getting ready
        2. How to do it…
        3. How it works…
      16. Creating a histogram for a column
        1. Getting ready
        2. How to do it…
        3. How it works…
      17. Plotting the data as a probability distribution
        1. How to do it…
        2. How it works…
      18. Plotting a cumulative distribution function
        1. How to do it…
        2. How it works…
      19. Showing the histogram as a stepped line
        1. How to do it…
        2. How it works…
      20. Plotting two sets of values in a probability distribution
        1. How to do it…
        2. How it works…
      21. Creating a customized box plot with whiskers
        1. How to do it…
        2. How it works…
      22. Creating a basic bar chart for a single column over time
        1. How to do it…
        2. How it works…
    11. 4. Performing Data Analysis for Non Data Analysts
      1. Performing a distribution analysis
        1. How to do it…
        2. How it works…
      2. Performing categorical variable analysis
        1. How to do it…
        2. How it works…
      3. Performing a linear regression
        1. How to do it…
        2. How it works…
      4. Performing a time-series analysis
        1. How to do it…
        2. How it works…
      5. Performing outlier detection
        1. How to do it…
        2. How it works…
      6. Creating a predictive model using logistic regression
        1. How to do it…
        2. How it works…
      7. Creating a predictive model using a random forest
        1. How to do it…
        2. How it works…
      8. Creating a predictive model using Support Vector Machines
        1. How to do it…
        2. How it works…
      9. Saving a predictive model for production use
        1. Getting ready
        2. How to do it…
        3. How it works…
    12. 5. Building a Business Intelligence Dashboard Quickly
      1. Creating reports in Excel directly from a Pandas DataFrame
        1. How to do it…
        2. How it works…
      2. Creating customizable Excel reports using XlsxWriter
        1. How to do it…
        2. How it works…
      3. Building a shareable dashboard using IPython Notebook and matplotlib
        1. Getting Set Up…
        2. How to do it…
        3. How it works…
      4. Exporting an IPython Notebook Dashboard to HTML
        1. Getting ready
        2. How to do it…
        3. How it works…
        4. See also
      5. Exporting an IPython Notebook Dashboard to PDF
        1. Getting ready
        2. How to do it…
          1. Method one…
          2. Method two…
      6. Exporting an IPython Notebook Dashboard to an HTML slideshow
        1. How to do it…
        2. How it works…
      7. Building your first Flask application in 10 minutes or less
        1. Getting Set Up…
        2. How to do it…
        3. How it works…
          1. See also
      8. Creating and saving your plots for your Flask BI dashboard
        1. How to do it…
        2. How it works…
      9. Building a business intelligence dashboard in Flask
        1. How to do it…
        2. How it works…
    13. Index