Text Processing with Ruby by Rob Miller

Ruby’s Support for Character Encodings

Ruby’s support for character encodings has progressed slowly but surely in recent versions. The support in the latest version is easiest to understand by tracing its history from Ruby 1.8, which had no support for any character encoding apart from US-ASCII, to the rather robust support found in the latest version of Ruby.

Ruby 1.8

In Ruby 1.8, released in 2003, there was essentially no support for character encodings at all. Source files were always interpreted as US-ASCII, and string methods operated on individual bytes rather than characters, so they often produced wrong results when they encountered multi-byte characters. For example, consider what the following code returns when run using Ruby 1.8:

 
"Hellø".reverse # => "\270\303lleH"
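
The result above can be reproduced on a modern Ruby by reversing the string's bytes by hand. The following is a minimal sketch rather than anything from the 1.8 implementation itself: the variable names are illustrative, and it assumes the string is stored as UTF-8, where "ø" occupies the two bytes 0xC3 0xB8 (octal \303\270).

```ruby
# "Hellø", written with an escape so the file itself stays pure ASCII.
word = "Hell\u00F8"

# Modern Ruby knows the string's encoding and reverses it character by character.
char_reversed = word.reverse

# Ruby 1.8 effectively treated a string as a plain sequence of bytes.
# Reversing the raw bytes simulates that behavior (an illustration only).
byte_reversed = word.bytes.reverse.pack("C*")

# The two bytes of "ø" now appear in the wrong order at the front of the string.
leading_bytes = byte_reversed.bytes.first(2)

# Reinterpreted as UTF-8, the byte-reversed string is no longer valid text.
still_valid = byte_reversed.dup.force_encoding("UTF-8").valid_encoding?

puts char_reversed
puts byte_reversed.inspect
puts leading_bytes.map { |b| format("%o", b) }.join(" ")
puts still_valid
```

Because the two bytes that make up "ø" are swapped, the byte-reversed string begins with a continuation byte and can no longer be decoded as UTF-8, which is why Ruby 1.8 could only display it as octal escapes.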