Training fastText embeddings

Setting up imports is actually quite simple in the new Gensim API; just use the following code:

from gensim.models.fasttext import FastText

The next step is to feed in the text and train our text embedding model, as follows:

# sg=1 selects skip-gram (otherwise CBOW); workers must be a positive thread count (Gensim 4.0+ renames size to vector_size)
fasttext_ted_model = FastText(sentences_ted, size=100, window=5, min_count=5, workers=4, sg=1)

You have probably noticed the parameters we pass when creating our model. The following list describes these parameters, following the Gensim documentation:

  • min_count (int, optional): The model ignores all words with total frequency lower than this
  • size (int, optional): This represents the dimensionality of word vectors
  • window (int, optional): This represents ...

