Feature extraction works by removing the last few layers of a trained neural network, which are specific to the original task, and reusing the output of the remaining layers as features.
Recall the neural network we developed in the Introduction to Neural Networks chapter to solve the MNIST problem:
arch = @mx.chain mx.Variable(:data) => mx.FullyConnected(num_hidden=64) => mx.FullyConnected(num_hidden=10) => mx.SoftmaxOutput(mx.Variable(:label))
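In this network, the SoftmaxOutput layer and the FullyConnected layer with 10 neurons are task-specific: together they map the learned representation onto the 10 MNIST digit classes. The other FullyConnected layer, with 64 neurons, is the one that holds the learned features.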
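The split between feature layers and task-specific layers can be sketched in code. The following is a minimal sketch, not a definitive implementation. It assumes the layers are given explicit names (`:fc1`, `:fc2`, and `:fc_new` are hypothetical names chosen here, not part of the original network definition), that `mx.get_internals` and name-based indexing behave as in MXNet.jl, and that the new task has 5 classes (an illustrative value only).

```julia
using MXNet

# Same architecture as above, with explicit layer names (an assumption made
# here) so that intermediate outputs can be looked up by name.
arch = @mx.chain mx.Variable(:data) =>
       mx.FullyConnected(name=:fc1, num_hidden=64) =>
       mx.FullyConnected(name=:fc2, num_hidden=10) =>
       mx.SoftmaxOutput(mx.Variable(:label))

# Collect every intermediate symbol of the network.
internals = mx.get_internals(arch)

# Print the available output names; the 64-neuron layer should appear
# as :fc1_output.
println(mx.list_outputs(internals))

# Keep everything up to and including the 64-neuron layer. The 10-neuron
# FullyConnected layer and the SoftmaxOutput are dropped.
features = internals[:fc1_output]

# Attach a new task-specific head. num_classes is a hypothetical value.
num_classes = 5
new_arch = @mx.chain features =>
           mx.FullyConnected(name=:fc_new, num_hidden=num_classes) =>
           mx.SoftmaxOutput(mx.Variable(:label))
```

Note that this only rewires the symbolic graph. To actually reuse the learned features, the trained weights of the retained layer would still need to be loaded into the new model.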
In the following activities, we will be removing the FullyConnected layers, defining the number of classes, and using the SoftmaxOutput ...