Chapter 8. Sequence-to-Sequence Mapping

In this chapter we’ll look at using sequence-to-sequence networks to learn transformations between pieces of text. This is a relatively new technique with tantalizing possibilities. Google claims to have made huge improvements to its Google Translate product using this technique; moreover, it has open sourced a version that can learn language translations purely based on parallel texts.

We won’t go that far to start with. Instead, we’ll start out with a simple model that learns the rules for pluralization in English. After that we’ll extract dialogue from 19th-century novels on Project Gutenberg and train a chatbot on it. For this last project we’ll have to abandon the safety of Keras running in a notebook and use Google’s open source seq2seq toolkit.

The following notebooks contain the code relevant for this chapter:

08.1 Sequence to sequence mapping
08.2 Import Gutenberg
08.3 Subword tokenizing

8.1 Training a Simple Sequence-to-Sequence Model

Problem

How do you train a model to reverse engineer a transformation?

Solution

Use a sequence-to-sequence mapper.

In Chapter 5 we saw how we can use recurrent networks to “learn” the rules of a sequence. The model learns how to best represent a sequence such that it can predict what the next element will be. Sequence-to-sequence mapping builds on this, but now the model learns to predict a different sequence based on the first one.

We can use this to learn all kinds of transformations. The simplest one, and the one we’ll start with, is the pluralization task mentioned above: given the characters of a singular English noun as the input sequence, the model should produce the characters of its plural form as the output sequence.
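To make the idea concrete, here is a minimal sketch of such a mapper in Keras on a toy character-level pluralization task. The word list, layer sizes, and helper names (pairs, encode, and so on) are assumptions chosen purely for illustration, not the code from the chapter’s notebook; the point is the encoder/decoder shape of the network.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

# Toy parallel corpus of (singular, plural) pairs; a real model would
# of course train on many more examples.
pairs = [('cat', 'cats'), ('dog', 'dogs'), ('bus', 'buses'),
         ('fox', 'foxes'), ('city', 'cities'), ('baby', 'babies')]

chars = sorted({c for s, p in pairs for c in s + p}) + [' ']
char_idx = {c: i for i, c in enumerate(chars)}
max_len = max(len(p) for _, p in pairs)

def encode(word):
    # Pad each word to max_len and one-hot encode its characters.
    word = word.ljust(max_len)
    out = np.zeros((max_len, len(chars)))
    for i, c in enumerate(word):
        out[i, char_idx[c]] = 1
    return out

X = np.array([encode(s) for s, _ in pairs])
Y = np.array([encode(p) for _, p in pairs])

# The encoder LSTM reads the input sequence into a fixed-size vector;
# RepeatVector feeds that vector to the decoder at every output step;
# the decoder LSTM then emits one character distribution per position.
model = Sequential([
    LSTM(128, input_shape=(max_len, len(chars))),
    RepeatVector(max_len),
    LSTM(128, return_sequences=True),
    TimeDistributed(Dense(len(chars), activation='softmax')),
])
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X, Y, epochs=300, verbose=0)

# Decode a prediction back into characters.
pred = model.predict(encode('dog')[None])[0]
print(''.join(chars[i] for i in pred.argmax(axis=-1)).strip())

On a toy set like this the network simply memorizes the pairs; the interesting result, which we’ll get to with a larger training set, is that it generalizes the underlying rules (add -s, -es, change -y to -ies) to words it has never seen.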
