In this chapter, we will once again climb the data-value pyramid, this time by incorporating a new source of data to improve our predictive model: the weather.
In practice, you will climb up and down the data-value pyramid as you operate your business and improve your analytics product.
Code examples for this chapter are available at https://github.com/rjurney/Agile_Data_Code_2/tree/master/ch10. Clone the repository and follow along!
git clone https://github.com/rjurney/Agile_Data_Code_2.git
Many flight delays are weather related, so a major determinant of flight on-time performance is the weather at the departing and arriving airports, and in between. For now, we’ll restrict our investigation to the weather at each flight’s pair of airports: its origin and its destination. A further iteration might determine the flight path and the weather along it.
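The pairing of a flight with the weather at its two airports can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the field names (`Origin`, `Dest`, `FlightDate`), the weather fields, the sample values, and the helper `add_airport_weather` are all hypothetical, and a real pipeline would likely perform the same join at scale rather than with an in-memory dictionary.

```python
from datetime import date

# Hypothetical daily weather observations keyed by (airport code, date).
# Field names and values are invented for illustration only.
weather_by_airport_day = {
    ("ATL", date(2015, 1, 1)): {"temp_c": 8.0, "precip_mm": 0.0, "wind_kph": 12.0},
    ("SFO", date(2015, 1, 1)): {"temp_c": 11.0, "precip_mm": 2.5, "wind_kph": 20.0},
}


def add_airport_weather(flight, weather):
    """Return a copy of the flight with origin and destination weather attached.

    Weather fields are prefixed with the airport's role ("Origin" or "Dest").
    If no observation exists for an airport and date, no fields are added for it.
    """
    enriched = dict(flight)
    for role in ("Origin", "Dest"):
        observation = weather.get((flight[role], flight["FlightDate"]), {})
        for field, value in observation.items():
            enriched[f"{role}_{field}"] = value
    return enriched


# Hypothetical flight record; only the keys needed for the join are shown.
flight = {"Origin": "ATL", "Dest": "SFO", "FlightDate": date(2015, 1, 1)}
enriched_flight = add_airport_weather(flight, weather_by_airport_day)
```

Prefixing each weather field with the airport's role keeps the origin and destination readings distinct, so a model can treat them as separate features.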
To use weather data, we’ll need historical observations for every airport in the United States to train our model, as well as weather forecasts to feed the model when it makes predictions about future flights. Fortunately, there is an abundance of open weather data available, both current and historical.
The National Centers for Environmental Information (NCEI), formerly the National ...