EXPRESSION TRANSFORMATION

In this section, I want to take a slightly closer look at what the optimizer does. More specifically, I want to consider what’s involved in transforming some relational expression into another, logically equivalent, expression. Note: I mentioned this notion under the discussion of duplicates in Chapter 4, where I explained that such transformations are one of the things the optimizer does; in fact, such transformations constitute one of the two great ideas at the heart of relational optimization (the other, beyond the scope of this book, is the use of “database statistics” to do what’s called cost based optimizing).[88]

I’ll start with a trivial example. Consider the following Tutorial D expression (the query is “Get suppliers who supply part P2, together with the corresponding quantities,” and I’ll ignore the SQL analog for simplicity):

( ( S JOIN SP ) WHERE PNO = 'P2' ) { ALL BUT PNO }

Suppose there are 1,000 suppliers and 1,000,000 shipments, of which 500 are for part P2. If the expression were simply evaluated by brute force (as it were), without any optimization at all, the sequence of events would be:

  1. Join S and SP: This step involves reading the 1,000 supplier tuples; reading the 1,000,000 shipment tuples 1,000 times each, once for each of the 1,000 suppliers; constructing an intermediate result consisting of 1,000,000 tuples; and writing those 1,000,000 tuples back out to the disk. (I’m assuming for simplicity that tuples are physically stored as such, ...

Get SQL and Relational Theory, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.