What can be done to improve the network? We can introduce a convolutional neural network (CNN). This architecture underlies many state-of-the-art solutions in image, text, and sound classification. The fundamental principle behind a CNN is the convolution operation: filters slide over the input, each filter produces a filtered feature map, and the feature maps are stacked over each other.
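To make the idea of stacked feature maps concrete, here is a minimal sketch in plain Julia with no MXNet dependency. The function `conv2d`, the sample `image`, and the two filters `edge` and `blur` are hypothetical names chosen for illustration, and the loop computes a sliding dot product without flipping the kernel, which is how most deep learning libraries define convolution.

```julia
# Minimal "valid" 2D convolution: slide the kernel over the image
# and take the element-wise product sum at every position.
# (No kernel flip, i.e. cross-correlation; an illustrative sketch only.)
function conv2d(image::AbstractMatrix, kernel::AbstractMatrix; stride::Int=1)
    kh, kw = size(kernel)
    ih, iw = size(image)
    oh = (ih - kh) ÷ stride + 1          # output height, no padding
    ow = (iw - kw) ÷ stride + 1          # output width, no padding
    out = zeros(Float64, oh, ow)
    for i in 1:oh, j in 1:ow
        r = (i - 1) * stride + 1         # top-left corner of the current window
        c = (j - 1) * stride + 1
        out[i, j] = sum(image[r:r+kh-1, c:c+kw-1] .* kernel)
    end
    return out
end

# Hypothetical 4x4 input and two hand-written 2x2 filters.
image = Float64[1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
edge  = Float64[1 -1; 1 -1]              # responds to horizontal intensity change
blur  = fill(0.25, 2, 2)                 # averages each 2x2 window

# Each filter yields one feature map; stack them along a third dimension.
maps = cat(conv2d(image, edge), conv2d(image, blur); dims=3)
println(size(maps))
```

In a trained CNN the filter values are learned rather than written by hand, and a layer with `num_filter=16` would produce 16 such maps instead of 2.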
In the following network definition, every convolution layer is followed by an activation function (ReLU here) to introduce non-linearity:
arch = @mx.chain mx.Variable(:data) =>
    mx.Convolution(kernel=(8, 8), num_filter=16, stride=(4, 4)) =>
    mx.Activation(act_type=:relu) =>
    mx.Convolution(kernel=(4, 4), num_filter=32, stride=(2, 2)) =>
    mx.Activation(act_type=:relu) =>
    mx.FullyConnected(num_hidden=256) ...
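It can help to trace how the kernel and stride settings shrink the spatial size of the data as it passes through the two convolution layers. The sketch below assumes an 84×84 input and no padding; the input size is not stated in the text and is only an assumption for illustration, and `conv_out` is a hypothetical helper, not part of MXNet.

```julia
# Output length of one spatial dimension for a convolution without padding.
conv_out(n, k, s) = (n - k) ÷ s + 1

n0 = 84                        # ASSUMED input height/width (not given in the text)
n1 = conv_out(n0, 8, 4)        # after the 8x8 kernel with stride 4
n2 = conv_out(n1, 4, 2)        # after the 4x4 kernel with stride 2
flat = n2 * n2 * 32            # values per sample reaching the fully connected layer

println("conv1: $(n1)x$(n1)x16, conv2: $(n2)x$(n2)x32, flattened: $flat")
```

Under these assumptions the first layer yields 20×20 maps and the second 9×9 maps, so the 256-unit fully connected layer would receive 2592 values per sample. A different input size changes these numbers but not the arithmetic.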