Data Analysis and Exploration with Pandas

Video Description

Get idiomatic solutions to common data problems while working with real-world datasets, and uncover surprising insights using the pandas library.

About This Video

  • Enhance your data exploration and machine learning skills by gaining surprising insights from pandas and using expert tips and tricks.
  • Solve complex scientific computing problems with ease using the power of pandas.
  • Leverage fast, robust data structures in pandas to gain useful insights from your data.

In Detail

Are you looking for a big boost in your productivity? Are you searching for interesting and fun tricks to solve your data problems? If so, this course is a good choice for you. It provides idiomatic solutions for both fundamental and advanced data manipulation tasks with pandas.

Some solutions focus on achieving a deeper understanding of basic principles, or on comparing and contrasting two similar operations. Others delve into a particular dataset and let you uncover new and unexpected insights along the way.

The pandas library is massive, and it's common for frequent users to be unaware of many of its more impressive features. The official pandas documentation, while thorough, does not contain many useful examples of how to piece together multiple commands as one would do during an actual analysis. This course guides you, as if you were looking over the shoulder of an expert, through practical situations that you are highly likely to encounter. Many advanced solutions combine several different features across the pandas library to generate results.
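As a rough illustration of what piecing together multiple commands can look like, here is a minimal sketch that chains several pandas methods into one expression. The `flights` table, its column names, and its values are hypothetical and invented for this example; they are not taken from the course or its code bundle.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only (not from the course).
flights = pd.DataFrame(
    {
        "airline": ["AA", "AA", "UA", "UA", "DL", "DL"],
        "dist": [500, 1500, 300, 2400, 800, 1200],
        "delay": [10.0, 30.0, 5.0, 45.0, None, 25.0],
    }
)

# Chain several operations: clean, filter, group, aggregate, and sort.
summary = (
    flights
    .dropna(subset=["delay"])              # drop rows with a missing delay
    .query("dist > 400")                   # keep only the longer flights
    .groupby("airline")["delay"]           # group the remaining rows by airline
    .agg(["mean", "count"])                # mean delay and row count per airline
    .sort_values("mean", ascending=False)  # worst average delay first
)

print(summary)
```

Writing the steps as one chain keeps each transformation on its own line, so steps can be commented out or reordered while exploring the data.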

The code bundle for the video course is available at https://github.com/PacktPublishing/Data-Analysis-and-Exploration-with-Pandas

Table of Contents

  1. Chapter 1 : Pandas Foundations
    1. The Course Overview 00:04:29
    2. Dissecting the Anatomy of a DataFrame 00:04:32
    3. Accessing the Main DataFrame Components 00:03:13
    4. Understanding Data Types 00:03:17
    5. Selecting a Single Column of Data as a Series 00:02:54
    6. Calling Series Methods 00:05:51
    7. Working with Operators on a Series 00:04:47
    8. Chaining Series Methods Together 00:03:49
    9. Making the Index Meaningful 00:02:18
    10. Renaming Row and Column Names 00:02:31
    11. Creating and Deleting Columns 00:05:06
  2. Chapter 2 : Essential DataFrame Operations
    1. Selecting Multiple DataFrame Columns 00:03:42
    2. Selecting Columns with Methods 00:03:42
    3. Ordering Column Names Sensibly 00:03:18
    4. Operating on the Entire DataFrame 00:02:50
    5. Chaining DataFrame Methods Together 00:02:35
    6. Working with Operators on a DataFrame 00:04:46
    7. Comparing Missing Values 00:03:09
    8. Transposing the Direction of a DataFrame Operation 00:02:34
    9. Determining College Campus Diversity 00:05:24
  3. Chapter 3 : Beginning Data Analysis
    1. Developing a Data Analysis Routine 00:05:08
    2. Reducing Memory by Changing Data Types 00:04:49
    3. Selecting the Smallest of the Largest 00:01:56
    4. Selecting the Largest of Each Group by Sorting 00:02:22
    5. Replicating nlargest with sort_values 00:02:27
  4. Chapter 4 : Selecting Subsets of Data
    1. Selecting Series Data 00:05:00
    2. Selecting DataFrame Rows 00:02:33
    3. Selecting DataFrame Rows and Columns Simultaneously 00:03:18
    4. Selecting Data with Both Integers and Labels 00:03:13
    5. Speeding Up Scalar Selection 00:03:11
    6. Slicing Rows Lazily 00:03:08
    7. Slicing Lexicographically 00:02:34
  5. Chapter 5 : Boolean Indexing
    1. Calculating Boolean Statistics 00:05:36
    2. Constructing Multiple Boolean Conditions 00:02:57
    3. Filtering with Boolean Indexing 00:02:33
    4. Replicating Boolean Indexing with Index Selection 00:02:29
    5. Selecting with Unique and Sorted Indexes 00:03:02
    6. Gaining Perspective on Stock Prices 00:03:15
    7. Translating SQL WHERE Clauses 00:03:07
    8. Determining the Normality of Stock Market Returns 00:03:54
    9. Improving Readability of Boolean Indexing with the Query Method 00:02:30
    10. Preserving Series with the where Method 00:03:54
    11. Masking DataFrame Rows 00:04:03
    12. Selecting with Booleans, Integer Location, and Labels 00:04:38
  6. Chapter 6 : Index Alignment
    1. Examining the Index Object 00:02:46
    2. Producing Cartesian Products 00:04:26
    3. Exploding Indexes 00:02:55
    4. Filling Values with Unequal Indexes 00:03:18
    5. Appending Columns from Different DataFrames 00:02:41
    6. Highlighting the Maximum Value from Each Column 00:04:06
    7. Replicating idxmax with Method Chaining 00:04:32
    8. Finding the Most Common Maximum 00:02:32
  7. Chapter 7 : Grouping for Aggregation, Filtration, and Transformation
    1. Defining an Aggregation 00:06:00
    2. Grouping and Aggregating with Multiple Columns and Functions 00:02:25
    3. Removing the MultiIndex After Grouping 00:02:52
    4. Customizing an Aggregation Function 00:03:22
    5. Customizing Aggregating Functions with *args and **kwargs 00:02:07
    6. Examining the groupby Object 00:02:57
    7. Filtering for States with a Minority Majority 00:03:23
    8. Transforming through a Weight Loss Bet 00:05:08
    9. Calculating Weighted Mean SAT Scores Per State with Apply 00:04:50
    10. Grouping By Continuous Variables 00:03:16
    11. Counting the Total Number of Flights Between Cities 00:03:41
    12. Finding the Longest Streak of On-Time Flights 00:06:42
  8. Chapter 8 : Restructuring Data into a Tidy Form
    1. Tidying Variable Values as Column Names with Stack 00:05:06
    2. Tidying Variable Values as Column Names with Melt 00:02:48
    3. Stacking Multiple Groups of Variables Simultaneously 00:04:28
    4. Inverting Stacked Data 00:03:12
    5. Unstacking After a groupby Aggregation 00:02:40
    6. Replicating pivot_table with a groupby Aggregation 00:03:02
    7. Renaming Axis Levels for Easy Reshaping 00:03:38
    8. Tidying When Multiple Variables are Stored as Column Names 00:03:16
    9. Tidying When Multiple Variables are Stored as Column Values 00:02:53
    10. Tidying When Two or More Values are Stored in the Same Cell 00:02:45
    11. Tidying When Variables are Stored in Column Names and Values 00:01:42
    12. Tidying When Multiple Observational Units are Stored in the Same Table 00:05:12
  9. Chapter 9 : Combining Pandas Objects
    1. Appending New Rows to DataFrames 00:07:50
    2. Concatenating Multiple DataFrames Together 00:03:15
    3. Comparing President Trump's and Obama's Approval Ratings 00:15:45
    4. Understanding the Differences Between concat, join, and merge 00:09:08
    5. Connecting to SQL Databases 00:05:46