We learned in the previous chapter that neural networks are made up of neurons, whose weights and biases are learned from a training dataset. The network is organized into layers, each composed of a number of neurons. Neurons in one layer are connected to neurons in the next layer through a set of weighted edges. Each neuron also has a pre-selected activation function. For every input it receives, a neuron computes the dot product of that input with its learned weights, adds its bias, and passes the result through its activation function to generate a response.
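To make this recap concrete, here is a minimal sketch of a single neuron's response and of one fully connected layer, written in plain Python. The function names (`sigmoid`, `neuron_response`, `layer_response`) and the choice of a sigmoid activation are assumptions made for illustration only; they are not taken from the previous chapter.

```python
import math


def sigmoid(z):
    """Logistic activation, used here only as an example choice."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_response(inputs, weights, bias, activation=sigmoid):
    """Dot product of inputs and weights, plus bias, passed through the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


def layer_response(inputs, weight_rows, biases, activation=sigmoid):
    """One layer: every neuron receives the same inputs and has its own weights and bias."""
    return [
        neuron_response(inputs, weights, bias, activation)
        for weights, bias in zip(weight_rows, biases)
    ]
```

For example, `neuron_response([1.0, 2.0], [0.5, -0.25], 0.0)` has a weighted sum of zero, so the sigmoid returns 0.5. Note that each neuron in a layer stores one weight per input, so the number of weights in a layer is the number of inputs times the number of neurons.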
Though this architecture works well for small-scale datasets, it faces a scaling challenge: