Chapter 6. Sequence Modeling for Natural Language Processing

A sequence is an ordered collection of items. Traditional machine learning assumes that data points are independent and identically distributed (IID), but in many situations, such as language, speech, and time-series data, one data item depends on the items that precede or follow it. Such data is called sequence data. Sequential information is everywhere in human language. Speech, for instance, can be considered a sequence of basic units called phonemes. In a language like English, the words in a sentence are not haphazard: each word can be constrained by the words that come before or after it. For example, the preposition “of” is likely to be followed by the article “the,” as in “The lion is the king of the jungle.” As another example, in many languages, including English, the number of a verb must agree with the number of the subject in a sentence. Here’s an example:

The book is on the table.
The books are on the table.

Sometimes the words involved in such a dependency can be arbitrarily far apart. For example:

The book that I got yesterday is on the table.
The books read by the second-grade children are shelved in the lower rack.
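
The dependencies described above can be made concrete with a small counting exercise. The following is a minimal sketch, not a method from this book: it treats the example sentences in this section as a toy corpus (lowercased, with sentence-final punctuation dropped) and counts which word immediately follows each word. The names `corpus`, `successor_counts`, and `following` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus: the example sentences from this section,
# lowercased and with sentence-final punctuation dropped (an assumption
# made here to keep tokenization trivial).
corpus = [
    "the lion is the king of the jungle",
    "the book is on the table",
    "the books are on the table",
    "the book that i got yesterday is on the table",
    "the books read by the second-grade children are shelved in the lower rack",
]


def successor_counts(sentences):
    """Map each word to a Counter of the words that immediately follow it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        # Pair each token with its right-hand neighbor.
        for current_word, next_word in zip(tokens, tokens[1:]):
            following[current_word][next_word] += 1
    return following


following = successor_counts(corpus)

print("after 'of':       ", dict(following["of"]))
print("after 'on':       ", dict(following["on"]))
print("after 'book':     ", dict(following["book"]))
print("after 'yesterday':", dict(following["yesterday"]))
```

In this toy corpus, “of” and “on” are always followed by “the,” which is the kind of local constraint described earlier. The long-distance case looks different: in “The book that I got yesterday is on the table,” the verb “is” directly follows “yesterday,” not “book,” so counts over adjacent words alone cannot tell that “is” agrees with “book.”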

In short, understanding sequences is essential to understanding human language. In the previous chapters, you were introduced to feed-forward neural networks, like multilayer perceptrons and convolutional neural networks, and to the power of vector representations. Although ...
