Load Retrosheet Game Logs

Load Retrosheet Game Log files to get a summary about every game played.

Sometimes you don’t need information about every play, just how a team played. This hack shows you how to load the game files into MySQL for querying. For each game, the game log contains up to 161 different variables covering where the game was played, when the game was played, what teams played the game, what players started for each team, how each team scored, a variety of offensive and defensive statistics, the umpires for the game, and miscellaneous extra data. These files don’t contain play-by-play data or stats on individual players.

Game log files are available for download from the Retrosheet web site, at http://www.retrosheet.org/gamelogs/index.html. The files are available for individual years or for multiple years as zip files. A second way to get these files is to generate them yourself from Retrosheet event files, using the BEVENT tool [Hack #15].

You can use this data for any purpose where wins and losses are important but individual events aren’t. One example is measuring park effects, as discussed in “Measure Park Effects” [Hack #56], where we need to know wins and losses between teams in different ballparks.
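To make the wins-and-losses idea concrete before loading anything into MySQL, here is a minimal Python sketch that tallies each team's home record straight from game log rows. It assumes the files are comma-delimited with no header row, and that the home team and the two scores sit at the positions given in Retrosheet's published field list (fields 7, 10, and 11, counting from one). The function name home_records and the sample rows are made up for illustration; check the field positions against the documentation for the files you download.

```python
import csv
import io
from collections import defaultdict

# Zero-based column positions. These are assumptions taken from Retrosheet's
# published field list (one-based fields 7, 10, and 11), not from this hack.
HOME, VISITOR_SCORE, HOME_SCORE = 6, 9, 10


def home_records(lines):
    """Tally each team's home wins, losses, and ties from game log rows."""
    records = defaultdict(lambda: {"W": 0, "L": 0, "T": 0})
    for row in csv.reader(lines):
        if len(row) <= HOME_SCORE:
            continue  # skip blank or truncated lines
        home = row[HOME]
        visitor_runs = int(row[VISITOR_SCORE])
        home_runs = int(row[HOME_SCORE])
        if home_runs > visitor_runs:
            records[home]["W"] += 1
        elif home_runs < visitor_runs:
            records[home]["L"] += 1
        else:
            records[home]["T"] += 1
    return dict(records)


# Made-up rows in the assumed layout, cut off after the first 11 fields.
sample = io.StringIO(
    '"20030401","0","Tue","AAA","AL",1,"BBB","AL",1,3,5\n'
    '"20030402","0","Wed","AAA","AL",2,"BBB","AL",2,4,2\n'
    '"20030403","0","Thu","BBB","AL",3,"AAA","AL",3,1,6\n'
)
print(home_records(sample))
```

With a real file, you would pass an open file object (opened with newline="") in place of the sample. Once the data is in MySQL, the same tally is a simple GROUP BY query, which is the main reason to load it.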

The Code

You can probably type in all the variables, but it’s a little tedious, so I’ll save you the effort. (Check the book’s web site to download the SQL code on this page.) I created a file with column names, called game_log_header.csv. The contents ...
