Wrapping Up

Character encodings can be tricky, but the landscape is better than it was a few years ago. The world is slowly but surely moving to UTF-8, and the closer it gets, the closer we come to no longer having to worry about character encoding in the vast majority of our programs.

Before that point, though, knowing how to recover from invalidly encoded strings without losing data is an important string to have in your bow.
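As a quick reminder of what that recovery can look like, here is a minimal sketch. It assumes we know (or have reasonably guessed) that the stray bytes are really ISO-8859-1; that source encoding and the variable names are illustrative assumptions, not something the text above specifies.

```ruby
# Bytes that spell "café" in ISO-8859-1 but are not valid UTF-8.
raw = "caf\xE9".b                       # binary (ASCII-8BIT) bytes: 63 61 66 E9

# Mislabelled as UTF-8, the string fails validation.
mislabelled = raw.dup.force_encoding(Encoding::UTF_8)
puts mislabelled.valid_encoding?        # false

# Lossy recovery: scrub replaces each invalid byte with U+FFFD,
# so the original "é" cannot be restored afterwards.
lossy = mislabelled.scrub

# Lossless recovery (assuming the bytes really are ISO-8859-1):
# relabel the bytes with their true encoding, then transcode to UTF-8.
recovered = raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
puts recovered                          # café
```

The difference between the two approaches is the point: `scrub` makes the string safe to process but throws the offending bytes away, whereas relabelling with `force_encoding` and then calling `encode` keeps every character, provided the guess about the original encoding is right.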

Thankfully, Ruby too is moving in the right direction. From offering little beyond ASCII text handling in version 1.8, it now boasts some of the best multilingualization support of any programming language. We can handle any number of different string encodings in our scripts all at ...
