We first compute the style representation, or Gram matrix, as the matrix of dot products between the rows of the unrolled filter matrix, where each row holds one filter's activations flattened across all spatial positions.
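As a minimal sketch of this step, the following NumPy snippet unrolls a layer's activations and forms the matrix of dot products. The function name `gram_matrix` and the `(height, width, channels)` layout are assumptions made for illustration; they are not taken from the original text.

```python
import numpy as np


def gram_matrix(activations):
    """Return the (n_C, n_C) Gram matrix of a layer's activations.

    Assumes `activations` has shape (n_H, n_W, n_C): height, width, channels.
    """
    n_h, n_w, n_c = activations.shape
    # Unroll: one row per filter, one column per spatial position.
    unrolled = activations.reshape(n_h * n_w, n_c).T  # shape (n_C, n_H * n_W)
    # Entry (i, j) is the dot product of filter i's and filter j's responses.
    return unrolled @ unrolled.T


# Random stand-in for real feature maps (hypothetical data).
rng = np.random.default_rng(0)
acts = rng.standard_normal((4, 4, 3))
gram = gram_matrix(acts)
print(gram.shape)  # (3, 3)
```

Each entry of the result measures how strongly two filters respond together across the image, which is why the matrix is symmetric.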
The style loss for hidden layer a can be written as follows:
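One common form, assuming the normalization used by Gatys et al., is shown below. The notation is an assumption introduced here: $\mathcal{G}^{[a](S)}$ and $\mathcal{G}^{[a](G)}$ are the Gram matrices of images S and G at layer a, $n_C$ is the number of filters, and $n_H \times n_W$ is the spatial size of the layer.

$$
L_{\text{style}}^{[a]}(S, G) = \frac{1}{4\, n_C^2\, (n_H n_W)^2} \sum_{i=1}^{n_C} \sum_{j=1}^{n_C} \left( \mathcal{G}^{[a](S)}_{ij} - \mathcal{G}^{[a](G)}_{ij} \right)^2
$$

Under the same assumptions, the per-layer loss could be sketched in NumPy as follows. The helper names `gram_matrix` and `layer_style_loss` are hypothetical, and the scaling constant is a convention that other implementations may choose differently.

```python
import numpy as np


def gram_matrix(activations):
    """Gram matrix of activations with assumed shape (n_H, n_W, n_C)."""
    n_h, n_w, n_c = activations.shape
    unrolled = activations.reshape(n_h * n_w, n_c).T  # (n_C, n_H * n_W)
    return unrolled @ unrolled.T


def layer_style_loss(style_acts, generated_acts):
    """Scaled squared distance between the two Gram matrices at one layer."""
    n_h, n_w, n_c = style_acts.shape
    diff = gram_matrix(style_acts) - gram_matrix(generated_acts)
    # Normalization constant assumed here: 1 / (4 * n_C^2 * (n_H * n_W)^2).
    return np.sum(diff ** 2) / (4.0 * n_c ** 2 * (n_h * n_w) ** 2)


# Tiny hand-checkable case: Gram matrices are [[1]] and [[0]].
tiny_loss = layer_style_loss(np.ones((1, 1, 1)), np.zeros((1, 1, 1)))
print(tiny_loss)  # 0.25
```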
We want to minimize the distance between the Gram matrices of the style image S and the generated image G. The overall weighted style loss, which is the quantity we minimize, can be written as follows:
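A sketch of this sum is given below, writing $L_{\text{style}}^{[a]}(S, G)$ for the style loss at hidden layer a and $\lambda^{[a]}$ for that layer's weight. The superscript notation is an assumption introduced here for illustration.

$$
L_{\text{style}}(S, G) = \sum_{a} \lambda^{[a]} \, L_{\text{style}}^{[a]}(S, G)
$$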
Here, λ represents the weights for different layers. Bear the following in mind:
- The style of an image can be represented using the Gram matrix ...