Learning Ruby

By Michael Fitzgerald. Published by O'Reilly Media, Inc.
Regular Expressions

You have already seen regular expressions in action. A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. The syntax for regular expressions was invented by mathematician Stephen Kleene in the 1950s.

I'll spend a little time demonstrating some patterns to search for strings. In this little discussion, you'll learn the fundamentals: how to use basic string patterns, square brackets, alternation, grouping, anchors, shortcuts, repetition operators, and braces. Table 4-1 lists the syntax for regular expressions in Ruby.

We need a little text to munch on. Here are the opening lines of Shakespeare's 29th sonnet:

opening = "When in disgrace with fortune and men's eyes\nI all alone beweep my outcast state,\n"

Note that this string contains two lines, set off by the newline character \n.

You can match the first line just by using a word in the pattern:

opening.grep(/men/) # => ["When in disgrace with fortune and men's eyes\n"]

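By the way, grep is not a String method; it comes from the Enumerable module, which the String class includes in Ruby 1.8, so it is available for processing strings. (Starting with Ruby 1.9, String no longer includes Enumerable, so there you call grep on the result of opening.lines instead.) grep takes a pattern as an argument, and can also take a block.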
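As a minimal sketch of grep with and without a block, the following assumes Ruby 1.9 or later, where grep is called on the lines of the string rather than on the string itself. The variable names matches and lengths are only illustrative.

```ruby
opening = "When in disgrace with fortune and men's eyes\nI all alone beweep my outcast state,\n"

# lines breaks the string into its individual lines, which respond to grep.
matches = opening.lines.grep(/men/)
# matches => ["When in disgrace with fortune and men's eyes\n"]

# With a block, grep passes each matching line to the block
# and collects the block's results instead of the lines.
lengths = opening.lines.grep(/men/) { |line| line.chomp.length }
# lengths => [44]
```

The block form is handy when you want something derived from each matching line rather than the line itself.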

When you use a pair of square brackets ([]), you can match any character in the brackets. Let's try to match the word man or men using []:

opening.grep(/m[ae]n/) # => ["When in disgrace with fortune and men's eyes\n"]

It would also match a line with the word man in it.
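To illustrate, here is a small sketch with made-up sample lines (they are not from the sonnet) showing that the bracket pattern accepts both man and men, but nothing else between m and n:

```ruby
pattern = /m[ae]n/

# Hypothetical sample lines, used only for this illustration.
samples = ["No man is an island\n", "men's eyes\n", "the moon\n"]

hits = samples.grep(pattern)
# hits => ["No man is an island\n", "men's eyes\n"]
```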

Alternation lets you match alternate forms of a pattern using the pipe character (|):

opening.grep(/men|man/) # => ["When in disgrace with fortune and men's eyes\n"]

Grouping uses parentheses to group a subexpression, like this one that contains an alternation:

opening.grep(/m(e|a)n/) # => ["When in disgrace with fortune and men's eyes\n"]
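The bracket, alternation, and grouping forms all find the same text here. One difference, sketched below, is that parentheses also remember what the subexpression matched. The sketch uses String#[] with a pattern, which returns the matched text; the variable names are illustrative.

```ruby
line = "When in disgrace with fortune and men's eyes\n"

# All three patterns match the same text.
bracket     = line[/m[ae]n/]    # => "men"
alternation = line[/men|man/]   # => "men"
grouped     = line[/m(e|a)n/]   # => "men"

# Passing 1 as a second argument returns the text captured by the first group.
vowel = line[/m(e|a)n/, 1]      # => "e"
```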

Anchors anchor a pattern to the beginning (^) or end ($) of a line:

opening.grep(/^When in/) # => ["When in disgrace with fortune and men's eyes\n"]
opening.grep(/outcast state,$/) # => ["I all alone beweep my outcast state,\n"]

The ^ means that a match is found when the text When in is at the beginning of a line, and $ matches outcast state, (comma included) only when that text is at the end of a line.
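Because ^ and $ work line by line, they can match more than once in a multiline string. The following sketch uses scan, which returns every match, to pick out the first and last word of each line:

```ruby
opening = "When in disgrace with fortune and men's eyes\nI all alone beweep my outcast state,\n"

# ^ matches at the start of every line, not just the start of the string.
starts = opening.scan(/^\w+/)
# starts => ["When", "I"]

# $ matches at the end of every line, just before the newline.
ends = opening.scan(/\w+,?$/)
# ends => ["eyes", "state,"]
```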

One way to specify the beginning and ending of strings in a pattern is with shortcuts. Shortcut syntax is brief: a single character preceded by a backslash. For example, the \d shortcut represents a digit; it is the same as using [0-9] but, well, shorter. Similar to ^, the shortcut \A matches the beginning of a string, not a line:

opening.grep(/\AWhen in/) # => ["When in disgrace with fortune and men's eyes\n"]

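Similar to $, the shortcut \z matches the end of a string, not a line. Because opening ends with a newline, \z cannot match directly after the comma; removing the trailing newline with chomp first lets it match:

opening.chomp.grep(/outcast state,\z/) # => ["I all alone beweep my outcast state,"]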

The shortcut \Z matches the end of a string, or just before a newline character (\n) if the string ends with one.
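The difference between the line anchors and the string anchors shows up as soon as a string has more than one line. In this sketch, =~ returns the offset of the match or nil; the variable names are illustrative.

```ruby
opening = "When in disgrace with fortune and men's eyes\nI all alone beweep my outcast state,\n"

# ^ matches at the start of the second line; \A matches only at the start of the string.
caret_hit  = (opening =~ /^I all/)     # => 45
string_hit = (opening =~ /\AI all/)    # => nil

# The string ends with "\n", so \z (the very end) fails,
# while \Z (the end, or just before a final newline) matches.
strict_end  = (opening =~ /state,\z/)  # => nil
lenient_end = (opening =~ /state,\Z/)  # => 75
```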

Let's figure out how to match a phone number in the form (555)123-4567. Supposing that the string phone contains a phone number like this, the following pattern will find it:

phone.grep(/(\(\d\d\d\))?\d\d\d-\d\d\d\d/) # => ["(555)123-4567"]

The backslash precedes the inner parentheses (\(...\)) to let the regexp engine know that these are literal characters. Otherwise, the engine will see the parentheses as enclosing a subexpression. The three \ds in the parentheses represent three digits. The hyphen (-) is just an unambiguous character, so you can use it in the pattern as is.

The question mark (?) is a repetition operator. It indicates zero or one occurrence of the previous pattern. So the phone number you are looking for can have an area code in parentheses, or not. The area-code pattern is surrounded by an unescaped ( and ), which group it so that the ? operator applies to the entire area code. (Square brackets would not work here, because a bracket expression matches only a single character.) Either form of the phone number, with or without the area code, will work. Here is a way to use ? with just a single character, u:
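As a quick check of the optional area code, the sketch below applies a pattern with an unescaped group around the escaped parentheses to two sample numbers. The name phone_pattern and the sample strings are assumptions made for illustration.

```ruby
# The outer, unescaped parentheses group the area code so that ? applies to all of it.
phone_pattern = /(\(\d\d\d\))?\d\d\d-\d\d\d\d/

with_area    = "(555)123-4567"[phone_pattern]   # => "(555)123-4567"
without_area = "123-4567"[phone_pattern]        # => "123-4567"

# The first capture holds the area code when present, and nil when absent.
area_code = "(555)123-4567"[phone_pattern, 1]   # => "(555)"
no_area   = "123-4567"[phone_pattern, 1]        # => nil
```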

color.grep(/colou?r/) # => ["I think that colour is just right for your office."]

The plus sign (+) operator indicates one or more of the previous pattern, in this case digits:

phone.grep(/(\(\d+\))?\d+-\d+/) # => ["(555)123-4567"]

Braces ({}) let you specify the exact number of digits, such as \d{3} or \d{4}:

phone.grep(/(\(\d{3}\))?\d{3}-\d{4}/) # => ["(555)123-4567"]


It is also possible to indicate an "at least" amount with {m,}, and a minimum/maximum number with {m,n}.
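Here is a small sketch of the three brace forms side by side. The sample string is made up, and \b (a word boundary) keeps each pattern from matching part of a longer number.

```ruby
# Hypothetical sample data: numbers of one, two, three, and seven digits.
digits = "7 42 555 8675309"

exactly_three = digits.scan(/\b\d{3}\b/)     # => ["555"]
at_least_two  = digits.scan(/\b\d{2,}\b/)    # => ["42", "555", "8675309"]
two_to_three  = digits.scan(/\b\d{2,3}\b/)   # => ["42", "555"]
```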

The String class also has the =~ method and the !~ operator. If =~ finds a match, it returns the offset position where the match starts in the string:

color =~ /colou?r/ # => 13

The !~ operator returns true if it does not match the string, false otherwise:

color !~ /colou?r/ # => false
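Because =~ returns nil when nothing matches, and nil is treated as false, it fits naturally in a conditional. The sketch below assumes color holds the sentence shown earlier; the pattern /magenta/ and the variable names are only for illustration.

```ruby
color = "I think that colour is just right for your office."

offset = color =~ /colou?r/    # => 13
none   = color =~ /magenta/    # => nil
absent = color !~ /magenta/    # => true

# An offset (even 0) counts as true in a conditional; nil counts as false.
verdict = if color =~ /colou?r/ then "found" else "missing" end
# verdict => "found"
```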

Also of interest are the Regexp and MatchData classes. The Regexp class lets you create a regular expression object. The MatchData class is the type of the special $~ variable, which encapsulates all search results from a pattern match.
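A brief sketch of both classes follows, assuming the same color string as before. Regexp.new builds a pattern from a string, and Regexp#match returns a MatchData object (or nil) that you can query for the matched text and its position.

```ruby
pattern = Regexp.new("colou?r")   # equivalent to the literal /colou?r/
color   = "I think that colour is just right for your office."

match = pattern.match(color)      # a MatchData object, or nil if nothing matched
matched_text = match[0]           # => "colour"
before       = match.pre_match    # => "I think that "
start_offset = match.begin(0)     # => 13

# After a successful match, $~ refers to the MatchData for the most recent match.
color =~ pattern
last_match_text = $~[0]           # => "colour"
```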

This discussion has given you a decent foundation in regular expressions (see Table 4-1 for a listing). With these fundamentals, you can define most any pattern.

Table 4-1. Regular expressions in Ruby

Pattern            Description
/pattern/options   Pattern in slashes, followed by optional options, i.e., one or more of: i for case-insensitive; o for performing #{} substitutions only once; x for ignore whitespace, allow comments; m for match multiple lines, newlines as normal characters
%r!pattern!        General delimited string for a regular expression, where ! can be an arbitrary character
^                  Matches beginning of line
$                  Matches end of line
.                  Matches any character (except a newline, unless the m option is set)
\1...\9            Matches nth grouped subexpression
\10                Matches nth grouped subexpression, if already matched; otherwise, refers to octal representation of a character code
\n, \r, \t, etc.   Matches character in backslash notation
\w                 Matches word character, as in [0-9A-Za-z_]
\W                 Matches nonword character
\s                 Matches whitespace character, as in [ \t\n\r\f]
\S                 Matches nonwhitespace character
\d                 Matches digit, same as [0-9]
\D                 Matches nondigit
\A                 Matches beginning of a string
\Z                 Matches end of a string, or before newline at the end
\z                 Matches end of a string
\b                 Matches word boundary outside [], or backspace (0x08) inside []
\B                 Matches nonword boundary
\G                 Matches point where last match finished
[..]               Matches any single character in brackets, such as [ch]at
[^..]              Matches any single character not in brackets
*                  Matches zero or more of previous regular expressions
*?                 Matches zero or more of previous regular expressions (nongreedy)
+                  Matches one or more of previous regular expressions
+?                 Matches one or more of previous regular expressions (nongreedy)
{m}                Matches exactly m number of previous regular expressions
{m,}               Matches at least m number of previous regular expressions
{m,n}              Matches at least m but at most n number of previous regular expressions
{m,n}?             Matches at least m but at most n number of previous regular expressions (nongreedy)
?                  Matches zero or one of previous regular expressions
|                  Alternation, such as color|colour
( )                Grouping regular expressions or subexpression, such as col(o|ou)r
(?: )              Grouping without back-references (without remembering matched text)
(?= )              Specify position with pattern
(?! )              Specify position with pattern negation
(?> )              Matches independent pattern without backtracking
(?imx)             Toggles i, m, or x options on
(?-imx)            Toggles i, m, or x options off
(?imx: )           Toggles i, m, or x options on within parentheses
(?-imx: )          Toggles i, m, or x options off within parentheses
(?ix-ix: )         Turns on (or off) i and x options within this noncapturing group
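A few of the constructs described in Table 4-1 were not covered in the discussion above. The sketch below tries out nongreedy repetition, grouping without back-references, a positional (lookahead) pattern, and the i and x options. The sample strings and variable names are made up for illustration.

```ruby
# Hypothetical sample text.
html = "<b>bold</b> and <i>italic</i>"

# Greedy + takes as much as it can; nongreedy +? stops at the first chance.
greedy    = html[/<.+>/]     # => "<b>bold</b> and <i>italic</i>"
nongreedy = html[/<.+?>/]    # => "<b>"

# (?: ) groups without capturing, so capture 1 is the word after the tag.
tag_word = html[/<(?:b|i)>(\w+)/, 1]     # => "bold"

# (?= ) requires that a pattern follows, without including it in the match.
before_close = html.scan(/\w+(?=<\/)/)   # => ["bold", "italic"]

# i ignores case; x ignores the whitespace inside the pattern.
loose = "COLOUR" =~ /col ou? r/ix        # => 0
```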
