Data Visualization with Python and JavaScript

Book Description

Learn how to turn raw data into rich, interactive web visualizations with the powerful combination of Python and JavaScript. With this hands-on guide, author Kyran Dale teaches you how to build a basic dataviz toolchain with best-of-breed Python and JavaScript libraries (including Scrapy, Matplotlib, Pandas, Flask, and D3) for crafting engaging, browser-based visualizations.

Table of Contents

  1. Preface
    1. Conventions Used in This Book
    2. Using Code Examples
    3. Safari® Books Online
    4. How to Contact Us
    5. Acknowledgements
  2. Introduction
    1. Who This Book is For
      1. Minimal requirements to use the book
    2. Why Python and JavaScript?
      1. Why not Python on the browser?
      2. Why Python for data-processing
      3. Python’s getting better all the time
    3. What You’ll Learn
      1. The Choice of Libraries
      2. Preliminaries
    4. The Dataviz Toolchain
      1. 1. Scraping data with Scrapy
      2. 2. Cleaning data with Pandas
      3. 3. Exploring data with Pandas and Matplotlib
      4. 4. Delivering your data with Flask
      5. 5. Transforming the data into interactive visualisations with D3
      6. Smaller Libraries
    5. A Little Bit of Context
    6. Summary
    7. Recommended Books
  3. 1. A Development Setup
    1. The Accompanying Code
    2. Python
      1. Anaconda
      2. Checking the Anaconda install
      3. Installing extra libs
      4. Virtual Environments
    3. JavaScript
      1. Content Delivery Networks (CDNs)
      2. Installing libraries locally
    4. Databases
      1. Installing MongoDB
    5. Integrated Development Environments
    6. Summary
  4. I. A Basic Toolkit
  5. 2. A Language Learning Bridge Between Python and JavaScript
    1. Similarities and Differences
    2. Interacting with the Code
      1. Python
      2. JavaScript
    3. Basic Bridge Work
      1. Style guidelines, PEP 8 and ‘use strict’
      2. Camel-case vs underscore
      3. Importing modules, including scripts
      4. Keeping your namespaces clean
      5. Outputting ‘Hello World’
      6. Simple data-processing
      7. String construction
      8. Significant whitespace vs curly brackets
      9. Comments and doc-strings
      10. Declaring variables, var
      11. Strings and numbers
      12. Booleans
      13. Data containers: dicts, objects, lists, arrays
      14. Functions
      15. Iterating: for loops and functional alternatives
      16. Conditionals: if, else, elif, switch
      17. File input and output
      18. Classes and prototypes
    4. Differences in Practice
      1. Method chaining
      2. Enumerating a list
      3. Tuple unpacking
      4. Collections
      5. Underscore
      6. Functional array methods and list comprehensions
      7. Map, reduce and filter with Python’s lambdas
      8. JavaScript closures and the module-pattern
      9. This is that
    5. A Cheatsheet
    6. Summary
  6. 3. Reading and Writing Data with Python
    1. Easy Does It
    2. Passing Data Around
    3. Working with System Files
    4. CSV, TSV and Row-column Data-formats
    5. JSON
      1. Dealing with dates and times
    6. SQL
      1. Creating the database engine
      2. Defining the database tables
      3. Adding instances with a session
      4. Querying the database
      5. Easier SQL with Dataset
    7. MongoDB
    8. Dealing with Dates, Times and Complex Data
    9. Summary
  7. 4. Webdev 101
    1. The Big Picture
    2. Single-page Apps
    3. Tooling Up
      1. The myth of IDEs, frameworks and tools
      2. Your text editing work-horse
      3. Browser with development tools
      4. Terminal or command-prompt
    4. Building a Web-page
      1. Serving Pages with HTTP
      2. The DOM
      3. The HTML skeleton
      4. Marking-up content
      5. CSS
      6. JavaScript
      7. Data
    5. Chrome’s Developer Tools
      1. The Elements Tab
      2. The Sources Tab
      3. Other Tools
    6. A Basic Page with Placeholders
      1. Filling the placeholders with content
    7. Scalable Vector Graphics (SVG)
      1. The svg element
      2. The g element
      3. Circles
      4. Applying CSS-styles
      5. Lines, rectangles, polygons
      6. Text
      7. Paths
      8. Scaling and rotating
      9. Working with groups
      10. Layering and transparency
      11. JavaScripted SVG
    8. Summary
  8. II. Getting Your Data
  9. 5. Getting Data off the Web with Python
    1. Getting Web-data with the requests library
    2. Getting Data-files with requests
    3. Using Python to Consume Data from a Web-API
      1. Using a RESTful Web-API with requests
      2. Getting some country data for the Nobel-viz
    4. Using Libraries to access Web-APIs
      1. Using Google-spreadsheets
      2. Using the Twitter API with Tweepy
    5. Scraping Data
      1. Why we need to scrape
      2. BeautifulSoup and lxml
      3. A First Scraping Foray
    6. Getting the soup
    7. Selecting tags
      1. Crafting some selection patterns
      2. Caching the web-pages
      3. Scraping the winners’ nationalities
    8. Summary
  10. 6. Heavyweight Scraping with Scrapy
    1. Setting up Scrapy
    2. Establishing the Targets
    3. Targeting HTML with Xpaths
      1. Testing xpaths with the Scrapy shell
      2. Selecting with relative Xpaths
    4. A First Scrapy Spider
    5. Scraping the Individual Biography Pages
    6. Chaining Requests and Yielding Data
      1. Caching our pages
      2. Yielding requests
    7. Scrapy Pipelines
    8. Scraping Text and Images with a Pipeline
  11. III. Cleaning and Exploring your Data with Pandas
  12. 7. Introduction to NumPy
    1. The NumPy Array
      1. Creating Arrays
      2. Array indexing, slicing
      3. A Few Basic Operations
    2. Creating Array Functions
      1. Calculating a Moving Average
    3. Summary
  13. 8. Introduction to Pandas
    1. Why Pandas is Tailor-made for Dataviz
    2. Why Pandas was Developed
    3. Heterogeneous Data and Categorising Measurements
    4. The Data Frame
      1. Indices
      2. Rows and columns
      3. Selecting groups
    5. Creating and Saving DataFrames
      1. JSON
      2. CSV
      3. Excel files
      4. SQL
      5. MongoDB
    6. Series into DataFrames
    7. Panels
    8. Summary
  14. 9. Cleaning Data With Pandas
    1. Coming Clean about Dirty Data
    2. Inspecting the Data
    3. Indices and Pandas Data Selection
      1. Selecting multiple rows
    4. Cleaning the Data
      1. Finding mixed types
      2. Replacing strings
      3. Removing rows
      4. Finding duplicates
      5. Sorting data
      6. Removing duplicates
      7. Dealing with times and dates
    5. The full clean_data function
    6. Saving the Cleaned Dataset
      1. Merging data frames
    7. Summary
  15. 10. Visualising Data With Matplotlib
    1. Pyplot and Object Oriented Matplotlib
    2. Starting an Interactive Session
    3. Interactive Plotting with Pyplot’s Global State
      1. Configuring Matplotlib
      2. Setting the figure’s size
      3. Points not Pixels
      4. Labels and legends
      5. Titles, axes-labels etc.
      6. Saving your charts
    4. Figures and Object Oriented Matplotlib
      1. Axes and subplots
    5. Plot types
      1. Bar charts
      2. Scatter-plots
    6. Seaborn
      1. Facetgrids
      2. Pairgrids
    7. Summary
  16. 11. Exploring Data With Pandas
    1. Starting to explore
    2. Plotting with Pandas
    3. Gender disparities
      1. Unstacking groups
      2. Historical trends
    4. National trends
      1. Prize winners per-capita
      2. Prizes by category
      3. Historical trends in prize distribution
    5. Age and life-expectancy of winners
      1. Age at time of award
      2. Life expectancy of the winners
      3. Increasing life-expectancy over time
    6. The Nobel Diaspora
    7. Summary
  17. IV. Delivering the Data
  18. 12. Delivering the Data
    1. Serving the Data
      1. Organizing your Flask files
      2. Serving data with Flask
    2. Delivering Static Files
    3. Dynamic Data with Flask
      1. A Simple RESTful API with Flask
    4. Using Static or Dynamic delivery
    5. Summary
  19. 13. RESTful Data with Flask
    1. A RESTful, MongoDB API with EVE
      1. Using AJAX to Access the API
    2. Delivering Data to the Nobel Visualisation
    3. RESTful SQL with Flask-Restless
      1. Creating the API
      2. Adding CORS support
      3. Querying the API
    4. Summary
  20. V. Visualising your Data with D3
  21. 14. Imagining a Nobel Visualization
    1. Who Is It For?
    2. Choosing Our Visual Elements
    3. The Menu Bar
    4. The Prizes by Year
    5. A Map Showing Selected Nobel Countries
    6. A bar chart Showing Number of Winners by Country
    7. A List of the Selected Winners
      1. A Mini-biography Box with Picture
    8. The Complete Visualisation
    9. Summary
  22. 15. Building a Visualisation
    1. Preliminaries
      1. The core components
      2. Organizing your files
      3. Serving the Data
    2. The HTML Skeleton
    3. The CSS Styling
    4. The JavaScript Engine
      1. Importing the scripts
      2. Basic data flow
      3. The core code
      4. Initialising the Nobel-viz
      5. Ready to go
      6. Data driven updates
      7. Filtering the data with Crossfilter
    5. Running the Nobel-viz App
    6. Summary
  23. 16. Introducing D3 - the story of a bar chart
    1. Framing the Problem
    2. Working with Selections
    3. Adding DOM Elements
    4. Leveraging D3
    5. Measuring Up with D3’s Scales
      1. Quantitative scales
      2. Ordinal scales
    6. Unleashing the Power of D3 with Data-binding
    7. The enter method
    8. Accessing the bound data
    9. The Update Pattern
    10. Axes and Labels
    11. Transitions
    12. Summary
  24. 17. Visualising the individual prizes
    1. Building the framework
    2. The Scales
    3. The Axes
    4. The Category Labels
    5. Nesting the Data
    6. Adding the Winners with a Nested Data-join
    7. A Little Transitional Sparkle
    8. Summary
  25. 18. Mapping with D3
    1. Available Maps
    2. D3’s Mapping Data-formats
      1. GeoJSON
      2. TopoJSON
      3. Converting maps to TopoJSON
    3. D3 Geo, Projections and Paths
      1. Projections
      2. Paths
      3. Graticules
    4. Putting the Elements Together
    5. Updating the Map
    6. Adding Value Indicators
    7. Our Completed Map
    8. Building a Simple Tooltip
    9. Summary
  26. 19. Visualising the Individual Winners
    1. Building the List
    2. Building the bio box
    3. Summary
  27. 20. The Menu-bar
    1. Creating HTML Elements with D3
    2. Building the Menu-bar
      1. Building the category selector
      2. Adding the gender selector
      3. Adding the country selector
      4. Wiring up the metric radio button
    3. Summary
  28. 21. Conclusion
    1. A Recap
      1. Part I, “A Basic Toolkit”
      2. Part II, “Getting Your Data”
      3. Part III, “Cleaning and Exploring your Data with Pandas”
      4. Part IV, “Delivering the Data”
      5. Part V, “Visualising your Data with D3”
    2. Future Progress
      1. Visualising social media networks
      2. Interactive mapping with Leaflet and Folium
      3. Machine learning visualisations
    3. Final thoughts
  29. A. Moving from Development to Production
    1. The Starting Directory
    2. Configuration
      1. Configuring Flask
      2. Configuring the JavaScript app
    3. Authentication
    4. Testing Flask apps
    5. Testing JavaScript apps
    6. Deploying Flask apps
      1. Configuring Apache
    7. Logging and Error-handling
  30. Index