Chapter 1. Before Unicode

When we need precise definitions of computer-related concepts that seem a little fuzzy to us, nothing is more refreshing and beneficial than consulting old documents. For example, in C.E. MacKenzie's Coded Character Sets, History and Development (1980) [242], we find the following definitions (slightly condensed here):

  • a bit is a binary digit, either 0 or 1;

  • a bit pattern is an ordered set of bits, usually of fixed length;

  • a byte is a bit pattern of fixed length; thus we speak of 8-bit bytes, 6-bit bytes, and so on;

  • a graphic is a particular shape, printed, typed, or displayed, that represents an alphabetic, numeric, or special symbol;

  • a character is a specific bit pattern and a meaning assigned to it: a graphic character has an assigned graphic meaning, and a control character has an assigned control meaning;

  • a bit code is a specific set of bit patterns to which either graphic or control meanings have been assigned;

  • a code table is a compact matrix form of rows and columns for exhibiting the bit patterns and assigned meanings of a code;

  • a shifted code is a code in which the meaning of a bit pattern depends not only on the bit pattern itself, but also on the fact that it has been preceded in the data stream by some other particular bit pattern, which is called a shift character (see the sketch following this list).
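To see how a shifted code behaves in practice, here is a minimal Python sketch of a Baudot-style decoder. The bit patterns, the table contents, and the names LTRS and FIGS are illustrative assumptions (the real ITA2 assignments differ); the point is only that the same bit pattern decodes differently depending on which shift character last appeared in the stream.

    # A minimal sketch of MacKenzie's "shifted code": a toy Baudot-style
    # alphabet.  The bit patterns, table contents, and the names LTRS and
    # FIGS are illustrative assumptions, not real ITA2 values.

    LTRS = 0b11111          # shift character: select the letters table
    FIGS = 0b11011          # shift character: select the figures table

    LETTERS = {0b00011: 'A', 0b11001: 'B', 0b01110: 'C'}
    FIGURES = {0b00011: '-', 0b11001: '?', 0b01110: ':'}

    def decode(bit_patterns):
        """Decode a stream of 5-bit patterns under a shifted code: the
        meaning of a pattern depends on the last shift character seen."""
        table = LETTERS                  # conventional start state
        decoded = []
        for pattern in bit_patterns:
            if pattern == LTRS:
                table = LETTERS          # change state, emit nothing
            elif pattern == FIGS:
                table = FIGURES
            else:
                decoded.append(table[pattern])
        return ''.join(decoded)

    # The pattern 0b00011 means 'A' after LTRS but '-' after FIGS:
    print(decode([LTRS, 0b00011, FIGS, 0b00011]))   # prints "A-"

Each dictionary here plays the role of MacKenzie's code table, and the two shift characters make the decoder stateful; that statefulness is precisely what distinguishes a shifted code from a plain bit code, where a pattern's meaning never depends on what preceded it.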

All this still makes sense; only the terminology has changed slightly. Nowadays a byte is always understood to be 8 bits long; what MacKenzie calls a "graphic" is now called a glyph; a ...
