Now that our dataframe is indexed by a datetime timestamp, we can construct a date-based slicing function. To do so, we will define a Boolean mask and use that mask to select rows from the existing dataframe. While we could certainly write this in one line, I think it's a little easier to read this way, as shown in the following code:
def select_dates(df, start, end):
    mask = (df.index > start) & (df.index <= end)
    return df[mask]
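To see how the mask treats its boundaries, the following minimal sketch applies the same function to a small made-up dataframe. The `toy` dataframe, its `value` column, and the chosen dates are illustrative assumptions only and are not part of the dataset used in this chapter:

```python
import pandas as pd


def select_dates(df, start, end):
    # Keep rows where start < timestamp <= end
    # (start is exclusive, end is inclusive).
    mask = (df.index > start) & (df.index <= end)
    return df[mask]


# Hypothetical toy data: one row per day for the first ten days of 2017.
index = pd.date_range("2017-01-01", periods=10, freq="D")
toy = pd.DataFrame({"value": range(10)}, index=index)

# Select the rows after January 3 up to and including January 6.
subset = select_dates(
    toy,
    start=pd.Timestamp("2017-01-03"),
    end=pd.Timestamp("2017-01-06"),
)
print(subset)
```

Because the lower bound uses a strict comparison, a row stamped exactly at `start` is left out, while a row stamped exactly at `end` is kept. In this sketch, that means the rows for January 4, 5, and 6 are returned.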
Now that we can grab portions of the dataframe using dates, we can easily create training and test dataframes with a couple of calls to this function, using the following code:
df = read_data()
df_train = select_dates(df, start="2017-01-01", end="2017-05-31")
df_test = select_dates(df, start="2017-06-01" ...