The interesting thing about this dataset is that each comment can have multiple labels. For instance, a comment could be both insulting and toxic, or it could be obscene and contain identity_hate elements.
Hence, we are leveling up here: instead of predicting a single label (such as positive or negative), we predict multiple labels in one go. For each label, we'll predict a value between 0 and 1 that indicates how likely the comment is to belong to that category.
This is not a probability in the strict Bayesian sense of the word, but it serves the same purpose: a higher value means the model is more confident that the label applies.
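To make the idea of one score per label concrete, here is a minimal sketch. It assumes each label gets its own raw model output that is squashed with a sigmoid, which is one common way to do multi-label prediction and not necessarily the approach used later on. The label names, the example values, and the 0.5 threshold are all illustrative assumptions.

```python
import math

# Assumed label names, for illustration only; the dataset's columns may differ.
LABELS = ["toxic", "obscene", "insult", "identity_hate"]


def sigmoid(x: float) -> float:
    """Squash a raw score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score_labels(raw_outputs):
    """Turn one raw model output per label into an independent score in (0, 1)."""
    return {label: sigmoid(z) for label, z in zip(LABELS, raw_outputs)}


def predicted_labels(scores, threshold=0.5):
    """Keep every label whose score reaches the (assumed) threshold."""
    return [label for label, s in scores.items() if s >= threshold]


# Hypothetical raw outputs for a single comment, not real model results.
example_raw_outputs = [2.0, -1.0, 1.5, -3.0]

scores = score_labels(example_raw_outputs)
active = predicted_labels(scores)

print(scores)
print(active)
```

Because each label is scored independently, the scores do not have to sum to 1, and several labels can be active for the same comment. This is the main difference from single-label classification, where the classes compete with each other.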
Let's preview the test dataset as well using ...