Fonts & Encodings by Yannis Haralambous

5.3. Conversion of Text from One Encoding to Another

There are few stand-alone tools for converting text, no doubt because text editors (such as BBEdit and UltraEdit) and word-processing packages (such as MS Word and Corel WordPerfect) handle this process internally. There is, however, a free library of subroutines devoted to converting between encodings: libiconv, developed by Bruno Haible. The GNU program supplied with this library to perform conversions is called iconv.
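
As a minimal sketch of what a conversion with iconv looks like, the commands below build a tiny ISO-8859-1 file and convert it to UTF-8. The file names and the choice of encodings are assumptions made for illustration; `-f` and `-t` name the source and target encodings.

```shell
# Build a small Latin-1 sample: "café" with é stored as the single byte 0xE9.
printf 'caf\351\n' > sample-latin1.txt

# Convert ISO-8859-1 to UTF-8. iconv writes to standard output,
# so the source file is left untouched.
iconv -f ISO-8859-1 -t UTF-8 sample-latin1.txt > sample-utf8.txt

# Dump the result as hex bytes: 0xE9 should now be the two bytes C3 A9.
od -An -tx1 sample-utf8.txt
```

Running `iconv -l` should list the encoding names that the installed version accepts.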

Figure 5-10. The main window of XKeyCaps

5.3.1. The recode Utility

In this section we shall describe a program with a long history (its origins, under a different name, go back to the 1970s) that today is based on libiconv: recode, by the Québécois François Pinard [293].

To convert a file foo.txt, all that we need is to find the source encoding A and the target encoding B on the list of encodings supported by recode and write:

    recode A..B foo.txt

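The file foo.txt will be overwritten in place with the converted text. To leave the original untouched, we can instead read from standard input and write to a new file: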

    recode A..B < foo.txt > foo-converted.txt
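
To make the two forms concrete, here is a sketch that assumes recode is installed and takes ISO-8859-1 as A and UTF-8 as B. Those encodings, the sample content, and the backup file name foo.txt.orig are illustrative choices, not part of the original text.

```shell
# Assumes the recode command is installed and on the PATH.
# Build a Latin-1 sample: "café" with é stored as the single byte 0xE9.
printf 'caf\351\n' > foo.txt

# Filter form: foo.txt is only read, and the result goes to a new file.
recode ISO-8859-1..UTF-8 < foo.txt > foo-converted.txt

# In-place form: keep a backup first, because recode overwrites foo.txt.
cp foo.txt foo.txt.orig
recode ISO-8859-1..UTF-8 foo.txt

# Both routes should yield the same UTF-8 bytes.
cmp foo.txt foo-converted.txt && echo "in-place and filter results match"
```

The filter form is the safer default when the source encoding is only a guess, since a wrong guess cannot damage the original file.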

In fact, we can go through multiple steps:

    recode A..B..C..D..E foo.txt
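
A chained request can be checked with a round trip. The sketch below, which again assumes recode is installed and uses illustrative file names and encodings, goes from ISO-8859-1 to UTF-8 and back in a single request, then compares the result with the input.

```shell
# Assumes the recode command is installed and on the PATH.
# Build a Latin-1 sample: "café" with é stored as the single byte 0xE9.
printf 'caf\351\n' > original.txt

# Two steps in one request: Latin-1 to UTF-8, then back to Latin-1.
recode ISO-8859-1..UTF-8..ISO-8859-1 < original.txt > roundtrip.txt

# If both steps are lossless, the output is byte-for-byte the input.
cmp original.txt roundtrip.txt && echo "round trip preserved every byte"
```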

What is even more interesting is that recode has the notion of a surface, which is roughly the equivalent of Unicode's serialization mechanisms (see page 62): a technique for transmitting data without changing the encoding. If S is a serialization mechanism, we can write:

    recode A..B/S foo.txt

and, in addition to the ...
