6

Character Mapping and Code Sets

Lost In Translation?

Using mixed character sets can be hazardous to your health. Back in the 1970s and 1980s, computer systems from different manufacturers were much harder to integrate into a single workflow. We had problems moving data from IBM to DEC computer systems, partly because the two represented character codes using different mapping schemes. On DEC VAX systems we used ASCII, while the IBM systems used EBCDIC, which was completely different.
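To make the difference concrete, here is a minimal Python sketch that prints the byte values of the same text under both schemes. It assumes Python's built-in `cp037` codec (one common EBCDIC code page) as a stand-in for the IBM side; the code page actually used on any given system may have been different, and the helper names `byte_values` and `ebcdic_to_ascii` are made up for illustration.

```python
# Sketch only: compare the byte values of the same text in ASCII and EBCDIC.
# Assumption: "cp037" (EBCDIC, US/Canada) represents the IBM encoding here.

def byte_values(text, encoding):
    """Return the hex value of each byte of `text` in the given encoding."""
    return [f"{b:02X}" for b in text.encode(encoding)]


def ebcdic_to_ascii(data):
    """Translate EBCDIC bytes to ASCII bytes by way of Unicode text.

    Raises UnicodeEncodeError if a character has no ASCII equivalent.
    """
    return data.decode("cp037").encode("ascii")


if __name__ == "__main__":
    for ch in "A", "a", "1", " ":
        print(repr(ch), "ASCII:", byte_values(ch, "ascii"),
              "EBCDIC:", byte_values(ch, "cp037"))
```

Going through Unicode text, rather than a hand-built byte-to-byte table, keeps the one-to-one mapping in the codec itself; that is a design choice of this sketch, not something the original systems had available.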

While mapping characters one-to-one wasn’t difficult, dealing with missing characters presented a serious problem. In EBCDIC, we had a character for U.S. cents (¢). It occupied the same position as the UK pound sign (£). There was already an ambiguous ...
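The missing-character problem can be sketched in a few lines of Python. This example again assumes the `cp037` code page, where the cent sign is stored as byte 0x4A; the function name `to_ascii` and the choice of `?` as a substitute are hypothetical, and a real conversion would need a deliberate policy for each unmappable character.

```python
# Sketch only: what happens to a character that exists in EBCDIC but not ASCII.
# Assumption: "cp037" is the EBCDIC code page; "?" is an arbitrary placeholder.

def to_ascii(ebcdic, substitute="?"):
    """Decode EBCDIC bytes and replace anything outside 7-bit ASCII."""
    text = ebcdic.decode("cp037")
    return "".join(ch if ord(ch) < 128 else substitute for ch in text)


if __name__ == "__main__":
    price = "5¢".encode("cp037")      # the cent sign has an EBCDIC byte value
    print(price.hex())                # byte values as stored on the EBCDIC side
    print(to_ascii(price))            # the cent sign is lost in translation

    try:
        "¢".encode("ascii")           # ASCII has no code for the cent sign
    except UnicodeEncodeError as err:
        print("cannot encode:", err.reason)
```

Substituting a placeholder silently changes the data, which is exactly the kind of loss described above; raising an error instead at least makes the problem visible.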

