Summary

This chapter was more than an introduction to the Gensim API. We now know how to load pre-trained GloVe vectors, and you can use these vector representations instead of TF-IDF features in any machine learning model.
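As a reminder of what using the vectors as features can look like, here is a minimal sketch. It relies only on the standard library rather than Gensim so that it runs without dependencies. The helper names `load_glove_text` and `document_vector` are hypothetical, the three-dimensional vectors are invented toy data, and averaging word vectors is just one common way to turn them into document features, not necessarily the one used in the chapter.

```python
import io


def load_glove_text(handle):
    """Parse GloVe's plain-text format: a word followed by its floats on each line."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def document_vector(tokens, vectors):
    """Average the vectors of known tokens; tokens without a vector are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Toy 3-dimensional data, invented for illustration only.
TOY_GLOVE = io.StringIO(
    "good 1.0 0.0 2.0\n"
    "movie 3.0 2.0 0.0\n"
)

toy_vectors = load_glove_text(TOY_GLOVE)
features = document_vector(["good", "movie", "unseen"], toy_vectors)
print(features)
```

The resulting fixed-length list can be passed to a classifier in the same place a TF-IDF row would go.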

We looked at why fastText vectors are often better than word2vec vectors on a small training corpus, and learned that you can use them with any ML model.
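One reason usually given for this is that fastText builds word vectors from character n-grams, so related words share parameters even when each is rare. The sketch below only illustrates that n-gram extraction. The function name `char_ngrams` is hypothetical, and the real library does more than this (for example, it hashes n-grams into buckets), which is omitted here.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"  # '<' and '>' mark the start and end of the word
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


trigrams = char_ngrams("where", n_min=3, n_max=3)
print(trigrams)

# Two related words share several trigrams, and therefore share parameters.
shared = set(char_ngrams("running", 3, 3)) & set(char_ngrams("runner", 3, 3))
print(sorted(shared))
```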

We learned how to build doc2vec models. You can now extend this doc2vec approach to build sent2vec or paragraph2vec style models as well. For paragraph2vec, little needs to change: each document is simply a paragraph instead.
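A minimal sketch of that change, assuming paragraphs are separated by blank lines and that whitespace tokenization is good enough. The function name `paragraphs_as_documents` and the tag scheme are hypothetical choices made for illustration.

```python
import re


def paragraphs_as_documents(text, doc_id):
    """Split text on blank lines and return one (words, tags) pair per paragraph."""
    chunks = [c.strip() for c in re.split(r"\n\s*\n", text) if c.strip()]
    return [
        (chunk.lower().split(), [f"{doc_id}_p{i}"])  # naive whitespace tokenizer
        for i, chunk in enumerate(chunks)
    ]


sample = "First paragraph here.\n\nSecond one follows."
docs = paragraphs_as_documents(sample, "review42")
for words, tags in docs:
    print(tags, words)
```

Each pair can then be wrapped in Gensim's `TaggedDocument(words=words, tags=tags)` and passed to a doc2vec model in the same way as whole documents.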

In addition, we now know how to quickly perform sanity checks on our doc2vec vectors without using an annotated test corpus. We did this by checking ...
