Ruby Cookbook

Cover of Ruby Cookbook by Lucas Carlson... Published by O'Reilly Media, Inc.

Chapter 1. Strings

Ruby is a programmer-friendly language. If you are already familiar with object-oriented programming, Ruby should quickly become second nature. If you've struggled with learning object-oriented programming or are not familiar with it, Ruby should make more sense to you than other object-oriented languages because Ruby's methods are consistently named, concise, and generally act the way you expect.

Throughout this book, we demonstrate concepts through interactive Ruby sessions. Strings are a good place to start because not only are they a useful data type, they're easy to create and use. They provide a simple introduction to Ruby, a point of comparison between Ruby and other languages you might know, and an approachable way to introduce important Ruby concepts like duck typing (see Recipe 1.12), open classes (demonstrated in Recipe 1.10), symbols (Recipe 1.7), and even Ruby gems (Recipe 1.20).

If you use Mac OS X or a Unix environment with Ruby installed, go to your command line right now and type irb. If you're using Windows, you can download and install the One-Click Installer, and do the same from a command prompt (you can also run the fxri program, if that's more comfortable for you). You've now entered an interactive Ruby shell, and you can follow along with the code samples in most of this book's recipes.

Strings in Ruby are much like strings in other dynamic languages like Perl, Python, and PHP. They're not much different from strings in Java and C. Ruby strings are dynamic, mutable, and flexible. Get started with strings by typing this line into your interactive Ruby session:

	string = "My first string"

You should see some output that looks like this:

	=> "My first string"

You typed in a Ruby expression that created a string "My first string", and assigned it to the variable string. The value of that expression is just the new value of string, which is what your interactive Ruby session printed out on the right side of the arrow. Throughout this book, we'll represent this kind of interaction in the following form:[1]

	string = "My first string"                 # => "My first string"

In Ruby, everything that can be assigned to a variable is an object. Here, the variable string points to an object of class String. That class defines over a hundred built-in methods: named pieces of code that examine and manipulate the string. We'll explore some of these throughout the chapter, and indeed the entire book. Let's try out one now: String#length, which returns the number of bytes in a string. Here's a Ruby method call:

	string.length                              # => 15

Many programming languages make you put parentheses after a method call:

	string.length()                            # => 15

In Ruby, parentheses are almost always optional. They're especially optional in this case, since we're not passing any arguments into String#length. If you're passing arguments into a method, it's often more readable to enclose the argument list in parentheses:

	string.count 'i'                           # => 2 # "i" occurs twice.
	string.count('i')                          # => 2

The return value of a method call is itself an object. In the case of String#length, the return value is the number 15, an instance of the Fixnum class. We can call a method on this object as well:

	string.length.next                         # => 16

Let's take a more complicated case: a string that contains non-ASCII characters. This string contains the French phrase "il était une fois," encoded as UTF-8:[2]

	french_string = "il \xc3\xa9tait une fois"   # => "il \303\251tait une fois"

Many programming languages (notably Java) treat a string as a series of characters. Ruby treats a string as a series of bytes. The French string contains 14 letters and 3 spaces, so you might think Ruby would say the length of the string is 17. But one of the letters (the e with acute accent) is represented as two bytes, and that's what Ruby counts:

	french_string.length                       # => 18

For more on handling different encodings, see Recipe 1.14 and Recipe 11.12. For more on this specific problem, see Recipe 1.8.
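
The byte-counting behavior described above is that of Ruby 1.8. As a hedged sketch, assuming Ruby 1.9 or later (where every string carries an encoding), String#length counts characters and String#bytesize reports the number of bytes. The helper name char_and_byte_counts is hypothetical, introduced only for illustration:

```ruby
# Assumes Ruby 1.9 or later. char_and_byte_counts is a hypothetical helper,
# not part of Ruby itself.
def char_and_byte_counts(str)
  [str.length, str.bytesize]
end

# "\u00e9" is the character é; the escape always yields a UTF-8 string.
french_string = "il \u00e9tait une fois"
char_and_byte_counts(french_string)        # => [17, 18]
```

The 17 is the count of letters and spaces worked out above, and the 18 is the byte count that Ruby 1.8 reports from String#length.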

You can represent special characters in strings (like the binary data in the French string) with string escaping. Ruby does different types of string escaping depending on how you create the string. When you enclose a string in double quotes, you can encode binary data into the string (as in the French example above), and you can encode newlines with the code "\n", as in other programming languages:

	puts "This string\ncontains a newline"
	# This string
	# contains a newline

When you enclose a string in single quotes, the only special codes you can use are "\'" to get a literal single quote, and "\\" to get a literal backslash:

	puts 'it may look like this string contains a newline\nbut it doesn\'t'
	# it may look like this string contains a newline\nbut it doesn't

	puts 'Here is a backslash: \\'
	# Here is a backslash: \

This is covered in more detail in Recipe 1.5. Also see Recipes 1.2 and 1.3 for more examples of the more spectacular substitutions double-quoted strings can do.

Another useful way to initialize strings is with the "here document" style:

	long_string = <<EOF
	Here is a long string
	With many paragraphs
	EOF
	# => "Here is a long string\nWith many paragraphs\n"

	puts long_string
	# Here is a long string
	# With many paragraphs

Like most of Ruby's built-in classes, Ruby's strings define the same functionality in several different ways, so that you can use the idiom you prefer. Say you want to get a substring of a larger string (as in Recipe 1.13). If you're an object-oriented programming purist, you can use the String#slice method:

	string                                     # => "My first string"
	string.slice(3, 5)                         # => "first"

But if you're coming from C, and you think of a string as an array of bytes, Ruby can accommodate you. Selecting a single byte from a string returns that byte as a number, which you can turn back into a one-character string with chr:

	string[3].chr + string[4].chr + string[5].chr + string[6].chr + string[7].chr
	# => "first"
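
The number-per-byte behavior is specific to Ruby 1.8. As a sketch, assuming Ruby 1.9 or later, indexing with a single integer returns a one-character string, and String#getbyte is the way to get a byte as a number. The helper bytes_as_string is a hypothetical name used only for this illustration:

```ruby
# Assumes Ruby 1.9 or later. bytes_as_string is a hypothetical helper.
def bytes_as_string(str, range)
  # Fetch each byte as a number, then convert it back to a character.
  range.map { |i| str.getbyte(i).chr }.join
end

string = "My first string"
string[3]                                  # => "f"
string.getbyte(3)                          # => 102
bytes_as_string(string, 3..7)              # => "first"
```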

And if you come from Python, and you like that language's slice notation, you can just as easily chop up the string that way:

	string[3, 5]                              # => "first"

Unlike strings in many programming languages, Ruby strings are mutable: you can change them after they are declared. Below we see the difference between the methods String#upcase and String#upcase!:

	string.upcase                             # => "MY FIRST STRING"
	string                                    # => "My first string"
	string.upcase!                            # => "MY FIRST STRING"
	string                                    # => "MY FIRST STRING"

This is one of Ruby's syntactical conventions. "Dangerous" methods (generally those that modify their object in place) usually have an exclamation mark at the end of their name. Another syntactical convention is that predicates, methods that return a true/false value, have a question mark at the end of their name (as in some varieties of Lisp):

	string.empty?                             # => false
	string.include? 'MY'                      # => true
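
One practical consequence of the exclamation-mark convention is worth a short sketch: String#upcase! returns nil when it has nothing to change, so its return value is less convenient to chain than that of String#upcase. The variable name below is illustrative only:

```ruby
# dup gives an unfrozen copy of the literal, so it is safe to modify in place.
string = "My first string".dup
string.upcase!                             # => "MY FIRST STRING"
string.upcase!                             # => nil (nothing left to change)

# When you need the value rather than the side effect, use the non-bang form.
string.upcase.include?('MY')               # => true
```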

This use of English punctuation to provide the programmer with information is an example of Matz's design philosophy: that Ruby is a language primarily for humans to read and write, and secondarily for computers to interpret.

An interactive Ruby session is an indispensable tool for learning and experimenting with these methods. Again, we encourage you to type the sample code shown in these recipes into an irb or fxri session, and try to build upon the examples as your knowledge of Ruby grows.

Here are some extra resources for using strings in Ruby:

  • You can get information about any built-in Ruby method with the ri command; for instance, to see more about the String#upcase! method, issue the command ri "String#upcase!" from the command line.

  • "why the lucky stiff" has written an excellent introduction to installing Ruby, and using irb and ri.

  • For more information about the design philosophy behind Ruby, read an interview with Yukihiro "Matz" Matsumoto, creator of Ruby.

1.1. Building a String from Parts


Problem

You want to iterate over a data structure, building a string from it as you do.

Solution

There are two efficient solutions. The simplest solution is to start with an empty string, and repeatedly append substrings onto it with the << operator:

	hash = { "key1" => "val1", "key2" => "val2" }
	string = ""
	hash.each { |k,v| string << "#{k} is #{v}\n" }
	puts string
	# key1 is val1
	# key2 is val2

This variant of the simple solution is slightly more efficient, but harder to read:

	string = ""
	hash.each { |k,v| string << k << " is " << v << "\n" }

If your data structure is an array, or easily transformed into an array, it's usually more efficient to use Array#join:

	puts hash.keys.join("\n") + "\n"
	# key1
	# key2

Discussion

In languages like Python and Java, it's very inefficient to build a string by starting with an empty string and adding each substring onto the end. In those languages, strings are immutable, so adding one string to another builds an entirely new string. Doing this multiple times creates a huge number of intermediary strings, each of which is only used as a stepping stone to the next string. This wastes time and memory.

In those languages, the most efficient way to build a string is always to put the substrings into an array or another mutable data structure, one that expands dynamically rather than by implicitly creating entirely new objects. Once you're done processing the substrings, you get a single string with the equivalent of Ruby's Array#join. In Java, this is the purpose of the StringBuffer class.

In Ruby, though, strings are just as mutable as arrays. Just like arrays, they can expand as needed, without using much time or memory. The fastest solution to this problem in Ruby is usually to forgo a holding array and tack the substrings directly onto a base string. Sometimes using Array#join is faster, but it's usually pretty close, and the << construction is generally easier to understand.
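
One way to check the comparison on your own machine is a rough timing sketch like the following. It assumes Ruby 2.1 or later (for Process.clock_gettime), and the helper names build_with_append, build_with_join, and seconds_to_run are hypothetical. The numbers it prints will vary by machine and Ruby version, so treat it as an experiment rather than a verdict:

```ruby
# Hypothetical helpers for comparison; none of these names come from the recipe.
def build_with_append(parts)
  result = "".dup                          # unfrozen empty string to append onto
  parts.each { |part| result << part }
  result
end

def build_with_join(parts)
  parts.join
end

# Returns the wall-clock seconds taken by the given block.
def seconds_to_run
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

parts = Array.new(100_000) { |i| i.to_s }
puts "<<:   #{seconds_to_run { build_with_append(parts) }} seconds"
puts "join: #{seconds_to_run { build_with_join(parts) }} seconds"
```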

If efficiency is important to you, don't build a new string when you can append items onto an existing string. Constructs like str << 'a' + 'b' or str << "#{var1} #{var2}" create new strings that are immediately subsumed into the larger string. This is exactly what you're trying to avoid. Use str << var1 << ' ' << var2 instead.
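
A minimal sketch of the chained-append idiom, assuming a single space as the separator. The helper append_pair is a hypothetical name. Each << returns its receiver, so the whole chain keeps growing the same object instead of building an intermediate string:

```ruby
# append_pair is a hypothetical helper illustrating the chained-append idiom.
def append_pair(str, var1, var2)
  str << var1 << ' ' << var2               # each << returns str itself
end

buffer = "".dup                            # unfrozen empty string
result = append_pair(buffer, 'key1', 'val1')
result.equal?(buffer)                      # => true (same object, grown in place)
buffer                                     # => "key1 val1"
```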

On the other hand, you shouldn't modify strings that aren't yours. Sometimes safety requires that you create a new string. When you define a method that takes a string as an argument, you shouldn't modify that string by appending other strings onto it, unless that's really the point of the method (and unless the method's name ends in an exclamation point, so that callers know it modifies objects in place).
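
The distinction can be sketched with a pair of methods, one that works on a copy and one that advertises its side effect with an exclamation mark. Both method names are hypothetical, chosen only to illustrate the convention:

```ruby
# Both method names are hypothetical.

# Safe version: appends to a copy, leaving the caller's string alone.
def with_signature(message, signature)
  message.dup << "\n-- " << signature
end

# In-place version: the exclamation mark warns callers that the
# argument itself is modified.
def add_signature!(message, signature)
  message << "\n-- " << signature
end

original = "Hello".dup
with_signature(original, "me")             # => "Hello\n-- me"
original                                   # => "Hello"
add_signature!(original, "me")
original                                   # => "Hello\n-- me"
```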

Another caveat: Array#join does not work precisely the same way as repeated appends to a string. Array#join accepts a separator string that it inserts between every two elements of the array. Unlike a simple string-building iteration over an array, it will not insert the separator string after the last element in the array. This example illustrates the difference:

	data = ['1', '2', '3']
	s = ''
	data.each { |x| s << x << ' and a '}
	s                                             # => "1 and a 2 and a 3 and a "
	data.join(' and a ')                          # => "1 and a 2 and a 3"

To simulate the behavior of Array#join across an iteration, you can use Enumerable#each_with_index and omit the separator on the last index. This only works if you know how long the Enumerable is going to be:

	s = ""
	data.each_with_index { |x, i| s << x; s << "|" if i < data.length-1 }
	s                                             # => "1|2|3"
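
When the length is not known in advance, one alternative (a sketch, not taken from the recipe) is to put the separator before every element except the first. The helper name join_each is hypothetical:

```ruby
# join_each is a hypothetical helper. It needs only #each, so it works on
# any Enumerable, including ones whose length is not known up front.
def join_each(enumerable, separator)
  result = "".dup
  first = true
  enumerable.each do |item|
    result << separator unless first       # separator goes before items 2..n
    result << item.to_s
    first = false
  end
  result
end

join_each(['1', '2', '3'], '|')            # => "1|2|3"
join_each(1..3, ' and a ')                 # => "1 and a 2 and a 3"
```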

[1] Yes, this was covered in the Preface, but not everyone reads the Preface.

[2] "\xc3\xa9" is a Ruby string representation of the UTF-8 encoding of the Unicode character é.
