Is a deep network really better than an MLP on this problem? Let's find out! After training for 500 epochs, here's how the model performed:
Model Train MAE: 0.0753991873787
Model Val MAE:   0.189703853999
Model Test MAE:  0.190189985043
We can see that the Train MAE has now decreased from 0.19 to 0.075. We've greatly reduced the bias of the network.
However, our variance has increased: the gap between the training error and the validation error is now much larger. The validation error did decrease slightly, which is good, but this widening gap suggests we are starting to overfit the training set.
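As a quick sanity check, this diagnosis can be read straight off the numbers above. A minimal sketch (the variable names and the "proxy" framing are mine, not part of the training code):

```python
# MAE values reported by the training run above
train_mae = 0.0753991873787
val_mae = 0.189703853999
test_mae = 0.190189985043

# Low training error -> low bias; a large val-train gap -> high variance.
bias_proxy = train_mae
variance_proxy = val_mae - train_mae

print(f"train MAE:          {train_mae:.4f}")
print(f"val - train gap:    {variance_proxy:.4f}")
# The gap (~0.114) is larger than the training error itself (~0.075),
# which is the signature of overfitting.
```

The test error tracking the validation error closely (0.1902 vs. 0.1897) also tells us the validation set is giving an honest estimate of generalization.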
The most straightforward way to reduce variance in cases ...