Statistical Machine Translation by Philipp Koehn

Chapter 4

Word-Based Models

In this chapter, we discuss word-based models. The models stem from the original work on statistical machine translation by the IBM Candide project in the late 1980s and early 1990s. While this approach no longer constitutes the state of the art, many of its principles and methods are still in use today.

Reviewing this seminal work will introduce many concepts that underpin other statistical machine translation models, such as generative modeling, the expectation maximization algorithm, and the noisy-channel model. At the end of the chapter, we will also look at word alignment as a problem in itself.

4.1 Machine Translation by Translating Words

We start this chapter with a simple model for machine translation ...
