Statistical Machine Translation

Book Description

The dream of automatic language translation is now closer thanks to recent advances in the techniques that underpin statistical machine translation. This class-tested textbook from an active researcher in the field provides a clear and careful introduction to the latest methods and explains how to build machine translation systems for any two languages. It introduces the subject's building blocks from linguistics and probability, then covers the major models for machine translation: word-based, phrase-based, and tree-based, as well as machine translation evaluation, language modeling, discriminative training, and advanced methods to integrate linguistic annotation. The book also reports the latest research, presents the major outstanding challenges, and enables novices as well as experienced researchers to make novel contributions to this exciting area. Ideal for students at undergraduate and graduate level, or for anyone interested in the latest developments in machine translation.

Table of Contents

  1. Cover
  2. Title
  3. Copyright
  4. Contents
  5. Preface
  6. I Foundations
    1. 1 Introduction
      1. 1.1 Overview
      2. 1.2 History of Machine Translation
      3. 1.3 Applications
      4. 1.4 Available Resources
      5. 1.5 Summary
    2. 2 Words, Sentences, Corpora
      1. 2.1 Words
      2. 2.2 Sentences
      3. 2.3 Corpora
      4. 2.4 Summary
    3. 3 Probability Theory
      1. 3.1 Estimating Probability Distributions
      2. 3.2 Calculating Probability Distributions
      3. 3.3 Properties of Probability Distributions
      4. 3.4 Summary
  7. II Core Methods
    1. 4 Word-Based Models
      1. 4.1 Machine Translation by Translating Words
      2. 4.2 Learning Lexical Translation Models
      3. 4.3 Ensuring Fluent Output
      4. 4.4 Higher IBM Models
      5. 4.5 Word Alignment
      6. 4.6 Summary
    2. 5 Phrase-Based Models
      1. 5.1 Standard Model
      2. 5.2 Learning a Phrase Translation Table
      3. 5.3 Extensions to the Translation Model
      4. 5.4 Extensions to the Reordering Model
      5. 5.5 EM Training of Phrase-Based Models
      6. 5.6 Summary
    3. 6 Decoding
      1. 6.1 Translation Process
      2. 6.2 Beam Search
      3. 6.3 Future Cost Estimation
      4. 6.4 Other Decoding Algorithms
      5. 6.5 Summary
    4. 7 Language Models
      1. 7.1 N-Gram Language Models
      2. 7.2 Count Smoothing
      3. 7.3 Interpolation and Back-off
      4. 7.4 Managing the Size of the Model
      5. 7.5 Summary
    5. 8 Evaluation
      1. 8.1 Manual Evaluation
      2. 8.2 Automatic Evaluation
      3. 8.3 Hypothesis Testing
      4. 8.4 Task-Oriented Evaluation
      5. 8.5 Summary
  8. III Advanced Topics
    1. 9 Discriminative Training
      1. 9.1 Finding Candidate Translations
      2. 9.2 Principles of Discriminative Methods
      3. 9.3 Parameter Tuning
      4. 9.4 Large-Scale Discriminative Training
      5. 9.5 Posterior Methods and System Combination
      6. 9.6 Summary
    2. 10 Integrating Linguistic Information
      1. 10.1 Transliteration
      2. 10.2 Morphology
      3. 10.3 Syntactic Restructuring
      4. 10.4 Syntactic Features
      5. 10.5 Factored Translation Models
      6. 10.6 Summary
    3. 11 Tree-Based Models
      1. 11.1 Synchronous Grammars
      2. 11.2 Learning Synchronous Grammars
      3. 11.3 Decoding by Parsing
      4. 11.4 Summary
  9. Bibliography
  10. Author Index
  11. Index