Quick wins – proof of concept

We want to quickly spot the types of algorithms and dataset combinations that sort of work for us. We can then focus on them and study them in greater detail.

The results from here will help you estimate the amount of work ahead of you. For instance, if you are going to develop a search system for documents based exclusively on keywords, your main effort will probably be deploying an open source solution such as ElasticSearch.

Let's say that you now want to add a similar documents feature. Depending on the expected quality of results, you will want to look into techniques such as doc2vec and word2vec, or even some convolutional neural network solution using Keras/Tensorflow or PyTorch.

This step is essential ...

Get Natural Language Processing with Python Quick Start Guide now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.