Chapter 8. Sequence-to-Sequence Mapping
In this chapter we'll look at using sequence-to-sequence networks to learn transformations between pieces of text. This is a relatively new technique with tantalizing possibilities. Google claims to have made huge improvements to its Google Translate product using this technique; moreover, it has open sourced a version that can learn language translations purely based on parallel texts.
We won't go that far to start with. Instead, we'll start out with a simple model that learns the rules for pluralization in English. After that we'll extract dialogue from 19th-century novels from Project Gutenberg and train a chatbot on them. For this last project we'll have to abandon the safety of Keras running in a notebook and will use Google's open source seq2seq toolkit.
The following notebooks contain the code relevant for this chapter:
08.1 Sequence to sequence mapping
08.2 Import Gutenberg
08.3 Subword tokenizing
8.1 Training a Simple Sequence-to-Sequence Model
Problem
How do you train a model to reverse engineer a transformation?
Solution
Use a sequence-to-sequence mapper.
In Chapter 5 we saw how we can use recurrent networks to "learn" the rules of a sequence. The model learns how to best represent a sequence such that it can predict what the next element will be. Sequence-to-sequence mapping builds on this, but now the model learns to predict a different sequence based on the first one.
We can use this to learn all kinds of transformations between sequences; the pluralization task from this chapter's introduction, mapping a singular noun to its plural form, is a simple example.
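The standard way to build such a mapper is an encoder-decoder pair: one recurrent network reads the input sequence and compresses it into its final state, and a second recurrent network generates the output sequence starting from that state. The following is a minimal Keras sketch of this idea, not the exact model from the chapter's notebook; the vocabulary size num_tokens and the state size latent_dim are assumed placeholders you would set from your own data.

```python
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

num_tokens = 64   # assumed: size of the character vocabulary
latent_dim = 128  # assumed: size of the LSTM hidden state

# Encoder: reads the input sequence (e.g., a singular noun, one-hot
# encoded per character) and summarizes it in its final LSTM state.
encoder_inputs = Input(shape=(None, num_tokens))
_, state_h, state_c = LSTM(latent_dim, return_state=True)(encoder_inputs)

# Decoder: generates the output sequence (e.g., the plural form),
# conditioned on the encoder's final state.
decoder_inputs = Input(shape=(None, num_tokens))
decoder_outputs = LSTM(latent_dim, return_sequences=True)(
    decoder_inputs, initial_state=[state_h, state_c])
decoder_outputs = Dense(num_tokens, activation='softmax')(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.summary()
```

During training the decoder input is typically the target sequence shifted by one step (so-called teacher forcing), and at inference time the decoder is run one step at a time, feeding each predicted token back in as the next input.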