Chapter 3: Predicting Sports Winners with Decision Trees

More on pandas

The pandas library is a great package—anything you normally write to do data loading is probably already implemented in pandas. You can learn more about it from their tutorial at http://pandas.pydata.org/pandas-docs/stable/tutorials.html

There is also a great blog post written by Chris Moffitt that overviews common tasks people do in Excel and how to do them in pandas: http://pbpython.com/excel-pandas-comp.html

You can also handle large datasets with pandas; see the answer, from user Jeff (the top answer at the time of writing), to this StackOverflow question for an extensive overview of the process: http://stackoverflow.com/questions/14262433/large-data-work-flows-using-pandas ...

Get Python: Real-World Data Science now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.