Cover by Leonard Richardson, Lucas Carlson

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

O'Reilly logo

4.7. Making Sure a Sorted Array Stays Sorted

Problem

You want to make sure an array stays sorted, even as you replace its elements or add new elements to it.

Solution

Subclass Array and override the methods that add items to the array. The new implementations add every new item to a position that maintains the sortedness of the array.

As you can see below, there are a lot of these methods. If you can guarantee that a particular method will never be called, you can get away with not overriding it.

	class 
SortedArray < Array

	  def initialize(*args, &sort_by)
	    @sort_by = sort_by || Proc.new { |x,y| x <=> y }
	    super(*args)
	    sort! &sort_by
	  end

	  def insert(i, v)
	    # The next line could be further optimized to perform a
	    # binary search.
	    insert_before = index(find { |x| @sort_by.call(x, v) == 1 })
	    super(insert_before ? insert_before : -1, v)
	  end

	  def <<(v)
	    insert(0, v)
	  end

	  alias push <<
	  alias unshift <<

Some methods, like collect!, can modify the items in an array, taking them out of sort order. Some methods, like flatten!, can add new elements to strange places in an array. Rather than figuring out a way to implement these methods in a way that preserves the sortedness of the array, we'll just let them run and then re-sort the array.[1]

	  ["collect!", "flatten!", "[]="].each do |method_name|
	    module_eval %{
	      def #{method_name}(*args)
	        super
	        sort! &@sort_by
	      end
	    }
	  end

	  def reverse!
	    #Do nothing; reversing the array would disorder it.
	  end
	end

A SortedArray created from an unsorted array will end up sorted:

	a = SortedArray.new([3,2,1])         # => [1, 2, 3]

Discussion

Many methods of Array are much faster on sorted arrays, so it's often useful to expend some overhead on keeping an array sorted over time. Removing items from a sorted array won't unsort it, but adding or modifying items can. Keeping a sorted array sorted means intercepting and reimplementing every sneaky way of putting objects into the array.

The SortedArray constructor accepts any code block you can pass into Array#sort, and keeps the array sorted according to that code block. The default code block uses the comparison operator (<=>) used by sort.

	unsorted= ["b", "aa", "a", "cccc", "1", "zzzzz", "k", "z"]
	strings_by_alpha = SortedArray.new(unsorted)
	# => ["1", "a", "aa", "b", "cccc", "k", "z", "zzzzz"]
	strings_by_length = SortedArray.new(unsorted) do |x,y|
	  x.length <=> y.length
	end
	# => ["b", "z", "a", "k", "1", "aa", "cccc", "zzzzz"]

The methods that add elements to an array specify where in the array they operate: push operates on the end of the array, and insert operates on a specified spot. SortedArray responds to these methods but it ignores the caller's request to put elements in a certain place. Every new element is inserted into a position that keeps the array sorted.

	a << -1                            # => [-1, 1, 2, 3]
	a << 1.5                           # => [-1, 1, 1.5, 2, 3]
	a.push(2.5)                        # => [-1, 1, 1.5, 2, 2.5, 3]
	a.unshift(1.6)                     # => [-1, 1, 1.5, 1.6, 2, 2.5, 3]

For methods like collect! and array assignment ([]=)that allow complex changes to an array, the simplest solution is to allow the changes to go through and then re-sort:

	a = SortedArray.new([10, 6, 4, -4, 200, 100])  
	# => [-4, 4, 6, 10, 100, 200]
	a.collect! { |x| x * -1 }          # => [-200, -100, -10, -6, -4, 4]

	a[3] = 25
	a                                  # => [-200, -100, -10, -4, 4, 25]
	# That is, -6 has been replaced by 25 and the array has been re-sorted.

	a[1..2] = [6000, 10, 600, 6]
	a                                  # => [-200, -4, 4, 6, 10, 25, 600, 6000]
	# That is, -100 and -10 have been replaced by 6000, 10, 600, and 6,
	# and the array has been re-sorted.

But with a little more work, we can write a more efficient implementation of array assignment that gives the same behavior. What happens when you run a command like a[0]= 10 on a SortedArray? The first element in the SortedArray is replaced by 10, and the SortedArray is re-sorted. This is equivalent to removing the first element in the array, then adding the value 10 to a place in the array that keeps it sorted.

Array#[]= implements three different types of array assignment, but all three can be modeled as a series of removals followed by a series of insertions. We can use this fact to implement a more efficient version of SortedArray#[]=:.

	class 
SortedArray
	  def []=(*args)
	    if args.size == 3
	      #e.g. "a[6,3] = [1,2,3]"
	      start, length, value = args
	      slice! Range.new(start, start+length, true)
	      (value.respond_to? :each) ? value.each { |x| self << x } : self << value
	    elsif args.size == 2
	      index, value = args
	      if index.is_a? Numeric
	        #e.g. "a[0] = 10" (the most common form of array assignment)
	        delete_at(index)
	        self << value
	      elsif index.is_a? Range
	        #e.g. "a[0..3] = [1,2,3]"
	        slice! index
	        (value.respond_to? :each) ? value.each { |x| self << x } : self << value
	      else
	        #Not supported. Delegate to superclass; will probably give an error.
	        super
	        sort!(&sort_by)
	      end
	    else
	      #Not supported. Delegate to superclass; will probably give an error.
	      super
	      sort!(&sort_by)
	    end
	  end
	end

Just as before, the sort will be maintained even when you use array assignment to replace some of a SortedArray's elements with other objects. But this implementation doesn't have to re-sort the array every time.

	a = SortedArray.new([1,2,3,4,5,6])
	a[0] = 10
	a                                     # => [2, 3, 4, 5, 6, 10]

	a[0, 2] = [100, 200]
	a                                     # => [4, 5, 6, 10, 100, 200]

	a[1..2] = [-4, 6]
	a                                     # => [-4, 4, 6, 10, 100, 200]

It's possible to subvert the sortedness of a SortedArray by modifying an object in place in a way that changes its sort order. Since the SortedArray never hears about the change to this object, it has no way of updating itself to move that object to its new sort position:[2]

	stripes = 
SortedArray.new(["aardwolf", "zebrafish"])
	stripes[1].upcase!  
	stripes                                 # => ["aardwolf", "ZEBRAFISH"]
	stripes.sort!                           # => ["ZEBRAFISH", "aardwolf"]

If this bothers you, you can make a SortedArray keep frozen copies of objects instead of the objects themselves. This solution hurts performance and uses more memory, but it will also prevent objects from being modified after being put into the SortedArray. This code adds a convenience method to Object that makes a frozen copy of the object:

	class Object
	  def to_frozen
	    f = self
	    unless frozen?
	       begin  
	         f = dup.freeze
	       rescue TypeError  
	         #This object can't be duped (e.g. Fixnum); fortunately,
	         #it usually can't be modified either 
	       end
	    end
	    return f
	  end
	end

The FrozenCopySortedArray stores frozen copies of objects instead of the objects themselves:

	class FrozenCopySortedArray < SortedArray
	  def insert(i, v)  
	    insert_before = index(find { |x| x > v })
	    super(insert_before ? insert_before : -1, v.to_frozen)
	  end

	  ["initialize", "collect!", "flatten!"].each do |method_name|
	    define_method(method_name) do 
	      super
	      each_with_index { |x, i| self[i] = x.to_frozen }
	      # No need to sort; by doing an assignment to every element
	      # in the array, we've made #insert keep the array sorted.
	    end
	  end
	end

	stripes = SortedArray.new(["aardwolf", "zebrafish"])
	stripes[1].upcase!  
	# TypeError: can't modify frozen string

Unlike a regular array, which can have elements of arbitrarily different data classes, all the elements of a SortedArray must be mutually comparable. For instance, you can mix integers and floating-point numbers within a SortedArray, but you can't mix integers and strings. Any data set that would cause Array# sort to fail makes an invalid SortedArray:

	[1, "string"].sort
	# ArgumentError: comparison of Fixnum with String failed

	a = SortedArray.new([1]) 
	a << "string" 
	# ArgumentError: comparison of Fixnum with String failed

One other pitfall: operations that create a new object, such as |=, +=, and to_a will turn an SortedArray into a (possibly unsorted) array.

	a = SortedArray.new([3, 2, 1])       # => [1, 2, 3]
	a += [1, -10]                        # => [1, 2, 3, 1, -10]
	a.class                              # => Array

The simplest way to avoid this is to override these methods to transform the resulting array back into a SortedArray:

	class SortedArray  
	  def + (other_array) 
	    SortedArray.new(super)
	  end
	end

See Also



[1] We can't use define_method to define these methods because in Ruby 1.8 you can't use define_method to create a method that takes a block argument. See Chapter 10 for more on this.

[2] One alternative is to modify SortedArray[] so that when you look up an element of the array, you actually get a delegate object that intercepts all of the element's method calls, and re-sorts the array whenever the user calls a method that modifies the element in place. This is probably overkill.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required