Baseball Hacks isn't your typical baseball book--it's a book about how to watch, research, and understand baseball. It's an instruction manual for the free baseball databases. It's a cookbook for baseball research. Every part of this book is designed to teach baseball fans how to do something. In short, it's a how-to book--one that will increase your enjoyment and knowledge of the game.
So much of the way baseball is played today hinges upon interpreting statistical data. Players are acquired based on their performance in the statistical categories that ownership deems most important. Managers make in-game decisions based not on instinct but on probability, such as how a particular batter might fare against left-handed pitching.
The goal of this unique book is to show fans all the baseball-related stuff that they can do for free (or close to free). Just as open source projects have made great software freely available, collaborative projects such as Retrosheet and Baseball DataBank have made great data freely available. You can use these data sources to research your favorite players, win your fantasy league, or appreciate the game of baseball even more than you do now.
Baseball Hacks shows how easy it is to get data, process it, and use it to truly understand baseball. The book lists a number of sources for current and historical baseball data, and explains how to load it into a database for analysis. It then introduces several powerful statistical tools for understanding data and forecasting results.
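As a rough illustration of the get-load-analyze workflow described above, here is a minimal Python sketch. It is not taken from the book: it uses Python's built-in `sqlite3` module as a stand-in for a full database server, and the table layout, column names, and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; real data would come from a downloaded file.
SAMPLE_CSV = """player_id,year,at_bats,hits
playerA,2004,500,150
playerB,2004,400,100
playerC,2004,50,20
"""

def load_batting(conn, csv_text):
    """Create a batting table (assumed layout) and load CSV rows into it."""
    conn.execute(
        "CREATE TABLE batting "
        "(player_id TEXT, year INTEGER, at_bats INTEGER, hits INTEGER)"
    )
    rows = [
        (r["player_id"], int(r["year"]), int(r["at_bats"]), int(r["hits"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO batting VALUES (?, ?, ?, ?)", rows)
    conn.commit()

def batting_averages(conn, min_at_bats=100):
    """Return (player_id, batting average) pairs, best first.

    Players with fewer than min_at_bats at bats are excluded.
    """
    cur = conn.execute(
        "SELECT player_id, ROUND(1.0 * hits / at_bats, 3) AS batting_avg "
        "FROM batting WHERE at_bats >= ? ORDER BY batting_avg DESC",
        (min_at_bats,),
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")  # in-memory database for the example
load_batting(conn, SAMPLE_CSV)
leaders = batting_averages(conn)
print(leaders)
```

The at-bat threshold keeps a player with only a handful of plate appearances from topping the list, which is one reason to filter in the query rather than after the fact.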
For the uninitiated baseball fan, author Joseph Adler walks readers through the core statistical categories for hitters (batting average, on-base percentage, etc.), pitchers (earned run average, strikeout-to-walk ratio, etc.), and fielders (putouts, errors, etc.). He then builds on these numbers to examine more advanced data groups such as career averages, team stats, and season-by-season comparisons. Whether you're a mathematician, a scientist, or a season-ticket holder for your favorite team, Baseball Hacks is sure to have something for you.
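For readers new to these categories, the standard definitions can be sketched in a few lines of Python. This is an illustrative sketch using the conventional formulas, not code from the book; the function and parameter names are assumptions, and each function returns `None` when the statistic is undefined.

```python
def batting_average(hits, at_bats):
    """Hits per at bat."""
    return hits / at_bats if at_bats else None

def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Times on base divided by the plate appearances that count toward OBP."""
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    return (hits + walks + hit_by_pitch) / chances if chances else None

def earned_run_average(earned_runs, innings_pitched):
    """Earned runs allowed per nine innings pitched."""
    return 9 * earned_runs / innings_pitched if innings_pitched else None

def strikeout_to_walk_ratio(strikeouts, walks):
    """Strikeouts per walk issued."""
    return strikeouts / walks if walks else None

def fielding_percentage(putouts, assists, errors):
    """Share of fielding chances handled without an error."""
    chances = putouts + assists + errors
    return (putouts + assists) / chances if chances else None

# Made-up season lines, for illustration only.
print(batting_average(150, 500))
print(on_base_percentage(150, 50, 5, 500, 5))
print(earned_run_average(60, 200))
print(strikeout_to_walk_ratio(180, 45))
print(fielding_percentage(300, 100, 8))
```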
Advance praise for Baseball Hacks:
"Baseball Hacks is the best book ever written for understanding and practicing baseball analytics. A must-read for baseball professionals and enthusiasts."
-- Ari Kaplan, database consultant to the Montreal Expos, San Diego Padres, and Baltimore Orioles
"The game was born in the 19th century, but the passion for its analysis continues to grow into the 21st. In Baseball Hacks, Joe Adler not only shows that the latest data-mining technologies have useful application to the study of baseball statistics, he also teaches the reader how to do the analysis himself, arming the dedicated baseball fan with tools to take his understanding of the game to a higher level."
-- Mark E. Johnson, Ph.D., Founder, SportMetrika, Inc. and Baseball Analyst for the 2004 St. Louis Cardinals
Table of Contents
1. Basics of Baseball
- Hacks 1–7: Introduction
- Score a Baseball Game
- Make a Box Score from a Score Sheet
  - The Official Rules for Scoring
  - Calculating a Box Score from a Score Sheet
  - Hacking the Hack
- Keep Score, Project Scoresheet–Style
- Follow Pitches During a Game
  - Following the Pitching Strategy
  - Identifying Pitches
- Follow the Game Online
- Add Baseball Searches to Firefox
- Find Images of Stadiums
2. Baseball Games from Past Years
- Hacks 8–23: Introduction
- Get and Install MySQL
- Get an Access Database of Player and Team Statistics
- Get a MySQL Database of Player and Team Statistics
- Make Your Own Stats Book
- Get Perl
- Learn Perl
- Get Historical Play-by-Play Data
- Make Box Scores or Database Tables from Play-by-Play Data with Retrosheet Tools
- Use SQL to Explore Game Data
- Use Microsoft Access to Run SQL Queries
- Get a GUI for MySQL
- Move Data from a Database to Excel
- Load Baseball Data into MySQL
- Load Retrosheet Game Logs
- Make a Historical Play-by-Play Database
- Use Regular Expressions to Identify Events
3. Stats from the Current Season
- Hacks 24–29: Introduction
- Use Microsoft Excel Web Queries to Get Stats
- Spider Baseball Sites for Data
- Discover How Live Score Applications Work
- Keep Your Stats Database Up-to-Date
- Get Recent Play-by-Play Data
- Find Data on Hit Locations
4. Visualize Baseball Statistics
- Hacks 30–39: Introduction
- Plot Histograms in Excel
- Get R and R Packages
- Analyze Baseball with R
- Access Databases Directly from Excel or R
- Load Text Files into R
- Compare Teams and Players with Lattices
- Compare Teams Using Chernoff Faces
- Plot Spray Charts
- Chart Team Stats in Real Time
- Slice and Dice Teams with Cubes
5. Formulas
- Hacks 40–59: Introduction
- Measure Batting with Batting Average
- Measure Batting with On-Base Percentage
- Measure Batting with SLG
- Measure Batting with OPS
- Measure Power with ISO
- Measure Batting with Runs Created
- Measure Batting with Linear Weights
- Measure Pitching with ERA
- Measure Pitching with WHIP
- Measure Pitching with Linear Weights
- Measure Defense with Defensive Efficiency
- Measure Pitching with DIPS
- Measure Base Running Through EqBR
- Measure Fielding with Fielding Percentage
- Measure Fielding with Range Factor
- Measure Fielding with Linear Weights
  - The Formula
  - Calculating Fielding Runs
  - Sample Code
    - Summary statistics.
  - See Also
- Measure Park Effects
  - Sample Code
  - Using Park Factors
  - Hacking the Hack
  - See Also
- Calculate Fan Save Value
- Calculate Save Value
- Calculate Holds and Decent Holds for Relief Pitchers
6. Sabermetric Thinking
- Hacks 60–71: Introduction
- Calculate Expected Runs
- Calculate an Expected Hits Matrix
- Look for Evidence of Platoon Effects
- Significant Number of At Bats
- Find “Clutch” Players
- Calculate Expected Number of Wins
- Measure Hits by Pitch Count
- OBP, SLG, and Scoring Runs
- Measure Skill Versus Luck
- Odds of the Best Team Winning the World Series
- Top 10 Bargain Outfielders
  - The Code
  - Running the Hack
    - Identify common attributes.
    - Look at correlations.
    - Identify possible explanations for correlations.
    - Assign attribute scores.
    - Group players based on similarity.
    - Attach group membership to data set.
    - Transform salary variable.
    - Create linear regression model.
    - Compare predicted versus actual salaries.
    - Identify most-underpaid players.
    - Identify most-overpaid players.
  - Hacking the Hack
- Fitting Game Scores to a Strength Model
7. The Bullpen
- Hacks 72–75: Introduction
- Start or Join a Fantasy League
- Draft Your Fantasy Team
- Make a Scoreboard Widget
- Analyze Other Sports
- A. Where to Learn More Stuff
- B. Abbreviations