When we give a variable the value 'A', what exactly is that value? We’ve already alluded to the fact that there is some kind of encoding going on: recall that we mentioned the IBM-derived Latin1 scheme when we were discussing escaped character literals.
Computers work with binary values, typically made up of one or more bytes, and we clearly need some kind of mapping between the binary values in these bytes and the characters we want them to represent. We’ve all got to agree on what the binary values mean, or we can’t exchange information. To that end, the American Standards Association convened a committee in the 1960s which defined (and then redefined, tweaked, and generally improved over subsequent decades) a standard called ASCII (pronounced ass-key): the American Standard Code for Information Interchange.
This defined 128 characters, represented using 7 bits of a byte. The first 32 values, 0x00–0x1F, and also the very last value, 0x7F, are called control characters, and include things like the tab character (0x09), backspace (0x08), bell (0x07), and delete (0x7F).
The rest are called the printable characters. They include space (0x20), which is not a control character but a “blank” printable character; all the upper- and lowercase letters; and most of the punctuation marks in common use in English.
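To make the mapping concrete, here is a small sketch in Python (chosen purely for illustration; any language that exposes character codes shows the same correspondence):

```python
# ASCII assigns each character a numeric code that fits in 7 bits.
# ord() returns a character's code; chr() maps a code back to a character.
print(hex(ord('A')))       # the letter 'A' is code 0x41
print(hex(ord(' ')))       # space is 0x20, the first printable value
print(chr(0x09) == '\t')   # 0x09 is the tab control character

# The printable range runs from 0x20 (space) through 0x7E ('~');
# 0x7F, just past it, is the delete control character.
printable = [chr(c) for c in range(0x20, 0x7F)]
print(len(printable))      # 95 printable characters in all
```

Everything here stays within the original 7-bit range; codes from 0x80 upward are where the later, incompatible extensions discussed below come in.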
This was a start, but it rapidly became apparent that ASCII did not have enough characters to deal with a lot of the common Western (“Latin”) scripts; ...