2.5 PLAINTEXT LANGUAGE MODELS

Natural languages have statistical characteristics that are generally reflected in the ciphertext. We will show how these characteristics may be recognized and used to recover the plaintext and key from columnar transposition ciphertext.

We assume a language model in which plaintext, with letters drawn from a generic alphabet, is generated by a statistical source (Fig. 2.2). The iid source is the simplest example of a language model; it generates plaintext as the outcome of independent and identically distributed trials of a chance experiment. The iid source generates the plaintext n-gram X = (X_0, X_1, …, X_{n−1}) with probability

Figure 2.2 Generic statistical plaintext source.

$$\Pr\{X = (x_0, x_1, \ldots, x_{n-1})\} = \pi(x_0)\,\pi(x_1)\cdots\pi(x_{n-1})$$

For example, the probability of the ASCII plaintext Good morning is

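$$\Pr\{\text{Good morning}\} = \pi(\text{G})\,\pi(\text{o})\,\pi(\text{o})\,\pi(\text{d})\,\pi(\text{space})\,\pi(\text{m})\,\pi(\text{o})\,\pi(\text{r})\,\pi(\text{n})\,\pi(\text{i})\,\pi(\text{n})\,\pi(\text{g})$$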

where π is a probability distribution on the plaintext letters. As the iid source generates letters independently, plaintexts that differ only by the arrangement of their letters are assigned the same probability; that is, Pr{Good morning} = Pr{Gd mooogninr}.
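The invariance under rearrangement can be checked numerically. The following Python sketch is illustrative only: the distribution π is estimated from a short made-up sample sentence (an assumption, not data from the text), and the names estimate_pi and iid_log_prob are hypothetical helpers. Log-probabilities are used so that products of many small factors do not underflow.

```python
import math
from collections import Counter


def estimate_pi(sample):
    """Estimate the letter distribution pi as relative frequencies in `sample`."""
    counts = Counter(sample)
    total = sum(counts.values())
    return {ch: c / total for ch, c in counts.items()}


def iid_log_prob(text, pi):
    """Return log Pr{text} under the iid source: the sum of log pi(x_i)."""
    return sum(math.log(pi[ch]) for ch in text)


# Assumed toy sample used only to obtain some distribution pi;
# it contains every character that occurs in "Good morning".
sample = "Good morning, and good night to everyone in the morning room"
pi = estimate_pi(sample)

plain = "Good morning"
rearranged = "".join(sorted(plain))  # same letters, different order

lp_plain = iid_log_prob(plain, pi)
lp_rearranged = iid_log_prob(rearranged, pi)

# The two values agree up to floating-point rounding, because the sum
# contains the same terms in a different order.
print(lp_plain, lp_rearranged)
```

Any other rearrangement of the same letters gives the same value, since the iid probability depends only on how many times each letter occurs.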

Because columnar transposition enciphers plaintext by rearranging the ...
