Ruby Cookbook by Leonard Richardson, Lucas Carlson

Chapter 4. Arrays

Like all high-level languages, Ruby has built-in support for arrays, objects that contain ordered lists of other objects. You can use arrays (often in conjunction with hashes) to build and use complex data structures without having to define any custom classes.

An array in Ruby is an ordered list of elements. Each element is a reference to some object, the way a Ruby variable is a reference to some object. For convenience, throughout this book we usually talk about arrays as though the array elements were the actual objects, not references to the objects. Since Ruby (unlike languages like C) gives no way of manipulating object references directly, the distinction rarely matters.
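One case where the reference distinction does show up is when an object is modified in place. The following is a minimal sketch (the variable names greeting and list are made up for illustration) of a variable and an array element referring to the same string:

```ruby
# dup gives us a mutable string even if string literals are frozen.
greeting = 'hello'.dup
list = [greeting]

# The variable and the array slot refer to one and the same object.
list[0].equal?(greeting)             # => true

# Modifying the object through the array is visible through the variable.
list[0] << ' world'
greeting                             # => "hello world"

# Reassigning the variable points it at a different object; the array
# keeps its reference to the original string.
greeting = 'goodbye'
list                                 # => ["hello world"]
```

The array never holds a copy of the string, only a reference to it, which is why the in-place change is visible under both names.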

The simplest way to create a new array is to put a comma-separated list of object references between square brackets. The object references can be predefined variables (my_var), anonymous objects created on the spot ('my string', 4.7, or MyClass.new), or expressions (a+b, object.method). A single array can contain references to objects of many different types:

	a1 = []                              # => []
	a2 = [1, 2, 3]                       # => [1, 2, 3]
	a3 = [1, 2, 3, 'a', 'b', 'c', nil]   # => [1, 2, 3, "a", "b", "c", nil]

	n1 = 4
	n2 = 6
	sum_and_difference = [n1, n2, n1+n2, n1-n2]
	# => [4, 6, 10, -2]

If your array contains only strings, you may find it simpler to build it by enclosing the strings in the %w{} syntax, separated by whitespace. This saves you from having to write all those quotes and commas:

	%w{1 2 3}                            # => ["1", "2", "3"]
	%w{The rat sat
	   on the mat}
	# => ["The", "rat", "sat", "on", "the", "mat"]

The << operator is the simplest way to add a value to an array. Ruby dynamically resizes arrays as elements are added and removed.

	a = [1, 2, 3]                  # => [1, 2, 3]
	a << 4.0                       # => [1, 2, 3, 4.0]
	a << 'five'                    # => [1, 2, 3, 4.0, "five"]

An array element can be any object reference, including a reference to another array. An array can even contain a reference to itself, though this is usually a bad idea, since it can send your code into infinite loops.

	a = [1,2,3]                      # => [1, 2, 3]
	a << [4, 5, 6]                   # => [1, 2, 3, [4, 5, 6]]
	a << a                           # => [1, 2, 3, [4, 5, 6], [...]]

As in most other programming languages, the elements of an array are numbered with indexes starting from zero. An array element can be looked up by passing its index into the array index operator []. The first element of an array can be accessed with a[0], the second with a[1], and so on.

Negative indexes count from the end of the array: the last element of an array can be accessed with a[-1], the second-to-last with a[-2], and so on. See Recipe 4.13 for more ways of using the array indexing operator.

The size of an array is available through the Array#size method. Because the index numbering starts from zero, the index of the last element of an array is the size of the array, minus one.

	a = [1, 2, 3, [4, 5, 6]]
	a.size                               # => 4
	a << a                               # => [1, 2, 3, [4, 5, 6], [...]]
	a.size                               # => 5

	a[0]                                 # => 1
	a[3]                                 # => [4, 5, 6]
	a[3][0]                              # => 4
	a[3].size                            # => 3

	a[-2]                                # => [4, 5, 6]
	a[-1]                                # => [1, 2, 3, [4, 5, 6], [...]]
	a[a.size-1]                          # => [1, 2, 3, [4, 5, 6], [...]]

	a[-1][-1]                            # => [1, 2, 3, [4, 5, 6], [...]]
	a[-1][-1][-1]                        # => [1, 2, 3, [4, 5, 6], [...]]

All languages with arrays have constructs for iterating over them (even if it's just a for loop). Languages like Java and Python have general iterator methods similar to Ruby's, but they're usually used for iterating over arrays. In Ruby, iterators are the standard way of traversing all data structures: array iterators are just their simplest manifestation.

Ruby's array iterators deserve special study because they're Ruby's simplest and most accessible iterator methods. If you come to Ruby from another language, you'll probably start off thinking of iterator methods as letting you treat aspects of a data structure "like an array." Recipe 4.1 covers the basic array iterator methods, including ones in the Enumerable module that you'll encounter over and over again in different contexts.

The Set class, included in Ruby's standard library, is a useful alternative to the Array class for many basic algorithms. A Ruby set models a mathematical set: sets are not ordered, and cannot contain more than one reference to the same object. For more about sets, see Recipes 4.14 and 4.15.
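As a quick preview, here is a small sketch of how a set differs from an array, using only the Set class from the standard library (the variable name s is arbitrary):

```ruby
require 'set'

s = Set.new([1, 2, 3])
s << 2                               # adding a duplicate changes nothing
s << 4
s.size                               # => 4
s.include?(3)                        # => true

# Two sets with the same members are equal, whatever order they were built in.
s == Set.new([4, 3, 2, 1])           # => true

# Converting an array to a set discards the duplicates.
[1, 1, 2, 3, 3].to_set.size          # => 3
```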

4.1. Iterating Over an Array

Problem

You want to perform some operation on each item in an array.

Solution

Iterate over the array with Array#each. Put into a block the code you want to execute for each item in the array.

	[1, 2, 3, 4].each { |x| puts x }
	# 1
	# 2
	# 3
	# 4

If you want to produce a new array based on a transformation of some other array, use Enumerable#collect along with a block that takes one element and transforms it:

	[1, 2, 3, 4].collect { |x| x ** 2 }             # => [1, 4, 9, 16]

Discussion

Ruby supports for loops and the other iteration constructs found in most modern programming languages, but its preferred idiom is a code block fed to a method like each or collect.

Methods like each and collect are called generators or iterators: they iterate over a data structure, yielding one element at a time to whatever code block you've attached. Once your code block completes, they continue the iteration and yield the next item in the data structure (according to whatever definition of "next" the generator supports). These methods are covered in detail in Chapter 7.

In a method like each, the return value of the code block, if any, is ignored. Methods like collect take a more active role. After they yield an element of a data structure to a code block, they use the return value in some way. The collect method uses the return value of its attached block as an element in a new array.
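The difference is easy to see by feeding the same block to both methods. This is a small sketch; the variable names are chosen only for illustration:

```ruby
numbers = [1, 2, 3]

# each discards the block's return value and returns the original array.
each_result = numbers.each { |x| x * 10 }
each_result                          # => [1, 2, 3]
each_result.equal?(numbers)          # => true

# collect gathers the block's return values into a new array.
collect_result = numbers.collect { |x| x * 10 }
collect_result                       # => [10, 20, 30]

# Neither call changes the original array.
numbers                              # => [1, 2, 3]
```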

Although commonly used with arrays, the collect method is actually defined in the Enumerable module, which the Array class includes. Many other Ruby classes (Hash and Range are just two) include the Enumerable methods; it's a sort of baseline for Ruby objects that provide iterators. Enumerable does not define the each method itself, but any class that includes Enumerable must define it, so you'll see that method a lot, too. This is covered in Recipe 9.4.
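To make the relationship concrete, here is a minimal sketch of a class that defines only each and picks up the rest from Enumerable. The Playlist class and its song names are hypothetical, invented for this example:

```ruby
# A hypothetical container class, used only for illustration.
class Playlist
  include Enumerable

  def initialize(*songs)
    @songs = songs
  end

  # The one method Enumerable requires: yield each element in turn.
  def each
    @songs.each { |song| yield song }
  end
end

playlist = Playlist.new('intro', 'verse', 'chorus')

# collect, select, and include? all come from Enumerable.
upcased = playlist.collect { |song| song.upcase }
# => ["INTRO", "VERSE", "CHORUS"]
long_names = playlist.select { |song| song.size > 5 }
# => ["chorus"]
playlist.include?('verse')           # => true
```

Playlist never mentions collect or select; Enumerable builds them on top of the each method the class supplies.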

If you need to have the array indexes along with the array elements, use Enumerable#each_with_index.

	['a', 'b', 'c'].each_with_index do |item, index|
	  puts "At position #{index}: #{item}"
	end
	# At position 0: a
	# At position 1: b
	# At position 2: c

Ruby's Array class also defines several generators not seen in Enumerable. For instance, to iterate over a list in reverse order, use the reverse_each method:

	[1, 2, 3, 4].reverse_each { |x| puts x }
	# 4
	# 3
	# 2
	# 1

Enumerable#collect has a destructive equivalent: Array#collect!, also known as Array#map! (a helpful alias for Python programmers). This method acts just like collect, but instead of creating a new array to hold the return values of its calls to the code block, it replaces each item in the old array with the corresponding value from the code block. This saves memory and time, but it destroys the old array:

	array = ['a', 'b', 'c']
	array.collect! { |x| x.upcase }
	array                                # => ["A", "B", "C"]
	array.map! { |x| x.downcase }
	array                                # => ["a", "b", "c"]

If you need to skip certain elements of an array, you can use the iterator methods Range#step and Integer#upto instead of Array#each. These methods generate a sequence of numbers that you can use as successive indexes into an array.

	array = ['junk', 'junk', 'junk', 'val1', 'val2']
	3.upto(array.length-1) { |i| puts "Value #{array[i]}" }
	# Value val1
	# Value val2

	array = ['1', 'a', '2', 'b', '3', 'c']
	(0..array.length-1).step(2) do |i|
	  puts "Letter #{array[i]} is #{array[i+1]}"
	end
	# Letter 1 is a
	# Letter 2 is b
	# Letter 3 is c

Like most other programming languages, Ruby lets you define for, while, and until loops—but you shouldn't need them very often. The for construct is equivalent to each, whether it's applied to an array or a range:

	for element in ['a', 'b', 'c']
	  puts element
	end
	# a
	# b
	# c

	for element in (1..3)
	  puts element
	end
	# 1
	# 2
	# 3

The while and until constructs take a boolean expression and execute the loop while the expression is true (while) or until it becomes true (until). All three of the following code snippets generate the same output:

	array = ['cherry', 'strawberry', 'orange']

	for index in (0...array.length)
	  puts "At position #{index}: #{array[index]}"
	end

	index = 0
	while index < array.length
	  puts "At position #{index}: #{array[index]}"
	  index += 1
	end

	index = 0
	until index == array.length
	  puts "At position #{index}: #{array[index]}"
	  index += 1
	end

	# At position 0: cherry
	# At position 1: strawberry
	# At position 2: orange

These constructs don't make for very idiomatic Ruby. You should only need to use them when you're iterating over a data structure in a way that doesn't already have an iterator method (for instance, if you're traversing a custom tree structure). Even then, it's more idiomatic if you only use them to define your own iterator methods.

The following code is a hybrid of each and reverse_each. It switches back and forth between iterating from the beginning of an array and iterating from its end.

	array = [1,2,3,4,5]
	new_array = []
	front_index = 0

	back_index = array.length-1
	while front_index <= back_index
	  new_array << array[front_index]
	  front_index += 1
	  if front_index <= back_index
	    new_array << array[back_index]
	    back_index -= 1
	  end
	end
	new_array                            # => [1, 5, 2, 4, 3]

That code works, but it becomes reusable when defined as an iterator. Put it into the Array class, and it becomes a universally accessible way of doing iteration, the colleague of each and reverse_each:

	class Array
	  def each_from_both_sides
	    front_index = 0
	    back_index = self.length-1
	    while front_index <= back_index
	      yield self[front_index]
	      front_index += 1
	      if front_index <= back_index
	        yield self[back_index]
	        back_index -= 1
	      end
	    end
	  end
	end

	new_array = []
	[1,2,3,4,5].each_from_both_sides { |x| new_array << x }
	new_array                            # => [1, 5, 2, 4, 3]

This "burning the candle at both ends" behavior can also be defined as a collect-type method: one that constructs a new array out of multiple calls to the attached code block. The implementation below delegates the actual iteration to the each_from_both_sides method defined above:

	class Array
	  def collect_from_both_sides
	    new_array = []
	    each_from_both_sides { |x| new_array << yield(x) }
	    return new_array
	  end
	end

	["ham", "eggs", "and"].collect_from_both_sides { |x| x.capitalize }
	# => ["Ham", "And", "Eggs"]

See Also
