The main limitation of a perceptron is its linearity. How is it possible to exploit this kind of architecture by removing such a constraint? The solution is easier than any speculation. Adding at least a non-linear layer between input and output leads to a highly non-linear combination, parametrized with a larger number of variables. The resulting architecture, called Multilayer Perceptron (MLP) and containing a single (only for simplicity) Hidden Layer, is shown in the following diagram:
This is a so-called feed-forward network, meaning that the flow of information begins in the first layer, proceeds always in the same ...