Chapter 12. Deep Learning and Beyond

In this book, we have made an effort to emphasize techniques and tools that are sufficiently robust to support practical applications. At times this has meant skimming over promising though less mature libraries and those intended primarily for individual research. Instead, we have favored tools that scale easily from ad hoc analyses on a single machine to large clusters managing interactions for many hundreds of thousands of users. In the previous chapter, we explored several such tools, from the Python multiprocessing library to the powerhouse Spark, which enable us to run many models in parallel, and do so rapidly enough to serve large-scale production applications. In this chapter we will discuss an equally significant advancement, neural networks, which are quickly becoming the new state of the art in natural language processing.

Ironically, neural networks are in some sense one of the most “old school” technologies covered in this book, with computational roots dating back to work done nearly 70 years ago. For most of this history, neural networks could not have been considered a practical machine learning method. However, this has changed rapidly over the last two decades thanks to three main advances: first, the dramatic increases in compute power made possible with GPUs and distributed computing in the early 2000s; then, the optimizations in learning rates over the last decade, which we’ll discuss later in the chapter; and finally, with ...
