Loading the dataset

While maybe not the most fun part of a machine learning problem, loading the data is an important step. I'm going to cover my data loading methodology here so that you can get a feel for how I handle loading a dataset.

from sklearn.preprocessing import StandardScalerimport pandas as pdTRAIN_DATA = "./data/train/train_data.csv"VAL_DATA = "./data/val/val_data.csv"TEST_DATA = "./data/test/test_data.csv"def load_data(): """Loads train, val, and test datasets from disk""" train = pd.read_csv(TRAIN_DATA) val = pd.read_csv(VAL_DATA) test = pd.read_csv(TEST_DATA) # we will use sklearn's StandardScaler to scale our data to 0 mean, unit variance. scaler = StandardScaler() train = scaler.fit_transform(train) val = scaler.transform(val) ...

Get Deep Learning Quick Reference now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.