Coding the hidden layers for our example

For our example problem, I'll use five hidden layers because I suspect there are many interactions between features. That hunch rests mainly on domain knowledge: having read the data description, I know this is a cross-sectional slice of a time series, so the observations may be autocorrelated.

I'll start with 128 neurons in the first layer (slightly fewer than my input size) and then narrow down to 16 by halves as we get toward the output. This is not a rule of thumb; it is based only on my own experience. We will use the following code to define our hidden layers:

x = Dense(128, activation='relu', name="hidden1")(inputs)
x = Dense(64, activation='relu', name="hidden2")(x)
x = Dense(64, activation='relu', ...
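To get a feel for how narrowing the layers affects model size, the following minimal sketch counts the trainable parameters of a stack of fully connected layers in plain Python. It rests on several assumptions that are not in the text: `INPUT_DIM = 130` is a hypothetical input size (the text only says it is slightly larger than 128), the last two widths in `HIDDEN_WIDTHS` (32 and 16) are guessed from the stated goal of ending at 16, and the helper names are made up for illustration. Each dense layer is assumed to have one weight per input-unit pair plus one bias per unit.

```python
# Hypothetical values: the text does not give the input size or all five widths.
INPUT_DIM = 130                        # assumed; "slightly more than 128"
HIDDEN_WIDTHS = [128, 64, 64, 32, 16]  # first three from the snippet, last two assumed


def dense_param_count(n_inputs, n_units):
    """Weights plus biases for one fully connected layer."""
    return (n_inputs + 1) * n_units


def stack_param_counts(input_dim, widths):
    """Parameter count of each layer when the layers are chained in order."""
    counts = []
    fan_in = input_dim
    for units in widths:
        counts.append(dense_param_count(fan_in, units))
        fan_in = units  # this layer's output feeds the next layer
    return counts


counts = stack_param_counts(INPUT_DIM, HIDDEN_WIDTHS)
for i, (units, params) in enumerate(zip(HIDDEN_WIDTHS, counts), start=1):
    print(f"hidden{i}: {units:>3} units, {params:>6} parameters")
print("total hidden-layer parameters:", sum(counts))
```

Under these assumptions, most of the parameters sit in the first, widest layer, which is one reason a tapering stack stays fairly small even with five hidden layers.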
