Data exploration and preparation

In your experiment, drag the Flight Delays Data sample dataset and click on the Visualize option to explore the dataset. You can find that some columns have lots of missing values. You can clean the missing data using a Clean Missing Data module by replacing it with MICE as the cleaning mode.

There are certain columns, such as DayOfWeek, OriginAirportID, and DestAirportID which contain continuous numbers; however, they are categorical variables. So, use the Metadata Editor module to set them as Categorical.

Feature selection

Before you start developing the model, it is important to select or generate a set of variables that have the most predictive power and remove any redundant and not so important features. In ...

Get Microsoft Azure Machine Learning now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.