After applying the softmax function, the output becomes a vector of the same dimension as its input (the corpus size, 20,000), and its elements sum to 1.
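To make the two properties concrete, here is a minimal sketch in plain Python. The scores are random placeholders rather than the output of any real model, and the names `softmax`, `scores`, `probs`, and `VOCAB_SIZE` are illustrative assumptions, not names from the original text. It only checks that the result keeps the same length as the input and that its elements sum to 1.

```python
import math
import random


def softmax(scores):
    """Map a list of real-valued scores to positive values that sum to 1."""
    # Subtracting the maximum avoids overflow in exp(); the result is unchanged.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


VOCAB_SIZE = 20_000  # the size mentioned in the text

random.seed(0)  # arbitrary seed so the run is repeatable
# Made-up scores standing in for the vector that is fed into softmax.
scores = [random.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]
probs = softmax(scores)

print(len(probs) == len(scores))       # same dimension as the input
print(math.isclose(sum(probs), 1.0))   # elements sum to 1, up to rounding
```

Because every exponential is positive, each entry of `probs` is strictly between 0 and 1, which is what allows the vector to be read as a probability distribution over the 20,000 entries.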


...works because h is also a vector. That was not totally obvious from the start....