To help you understand the role of the optimizer, consider the following query:
SELECT c.first_name,c.last_name,c.phone,p.name,p.price FROM customer c,orders o, product p WHERE c.id = o. customer_id AND o.product_id = p.id AND o.payment_status = 'FAILED' ORDER BY c.last_name,c.first_name
Figure 9-1. Parse tree for typical WHERE clause
We want to retrieve the first name, the last name, the phone number, and the product name and price for all the orders where payment has failed for one reason or another.
A naïve approach would loop through all of the records of
customer, and for each record of
customer loop through all of the records of
order, and then for each combination
of the two records loop through each record of
product. For each three-record combination,
the retrieval process would examine whether the combination matches the
WHERE clause, and keep only the
combinations that do. Afterward, the retrieval process would sort the
matched records and deliver them to the client.
You can see that this approach is not very efficient. Suppose each table has 10,000 records. The optimizer would have to examine 10,000 x 10,000 x 10,000 combinations, which is equal to 1 trillion. With a processor capable of examining1 million records per second, the query would take 1 million seconds, or more than 11 days.
On the other hand, suppose we have keys on
customer.id, orders.payment_status ...