EXERCISES

4.1 “Duplicates in databases are a good idea in because duplicates occur naturally in the real world. For example, all pennies are duplicates of one another.” How would you respond to this argument?

4.2 Let r be a relation and let bx and by be boolean expressions. Then there’s a law (used in relational systems to help with optimization, among other things) that states that (r WHERE bx) UNION (r WHERE by) ≡ r WHERE bx OR by. If r isn’t a relation but an SQL table with duplicates, does this law still apply?

4.3 Let a, b, and c be sets. Then the distributive law of intersection over union (also used in relational systems to help with optimization among other things) states that a INTERSECT (b UNION c) ≡ (a INTERSECT b) UNION (a INTERSECT c). If a, b, and c are bags instead of sets, does this law still apply?

4.4 Part of the SQL standard’s explanation of the FROM clause (as in a SELECT - FROM - WHERE expression) reads as follows:

[The] result of the <from clause> is the ... cartesian product of the tables identified by [the specifications in that <from clause>]. The ... cartesian product, CP, is the multiset of all rows r such that r is the concatenation of a row from each of the identified tables ...

Note, therefore, that CP isn’t well defined!—the fact that the standard goes on to say that “The cardinality of CP is the product of the cardinalities of the identified tables” notwithstanding. For example, consider the tables T1 and T2 shown here:

image with no caption

Observe now that all of the following fit the above definition for “the” cartesian product CP of T1 and T2 (that is, any of them could be “the” multiset referred to):

image with no caption

Can you fix up the wording of the standard appropriately?

4.5 Consider the following SQL cursor definition:

     DECLARE X CURSOR FOR SELECT SNO , QTY FROM SP ;

Note that (a) cursor X permits updates, (b) the table visible through cursor X permits duplicates, but (c) the underlying table SP doesn’t (permit duplicates, that is). Now suppose the operation DELETE ... WHERE CURRENT OF X is executed. Then there’s no way, in general, of saying which specific row of table SP is deleted by that operation. How would you fix this problem?

4.6 Please write out one googol times: There’s no such thing as a duplicate. Note: A googol is one followed by 100 zeros (i.e., 10 to the hundredth power). A googolplex is one followed by a googol zeros (i.e., 10 to the “googolth” power).

4.7 Do you think nulls occur naturally in the real world?

4.8 There’s a logical difference between null and the third truth value: True or false? (Perhaps I should ask: True, false, or unknown?)

4.9 In the body of the chapter, I gave truth tables for one monadic 3VL connective (NOT) and two dyadic 3VL connectives (AND and OR), but there are many other connectives as well (see Exercise 4.10 below). Another useful monadic connective is MAYBE,[58] with truth table as follows:

P

MAYBE p

T

F

U

T

F

F

Does SQL support this connective?

4.10 Following on from the previous exercise, how many distinct connectives are there altogether in 2VL? What about 3VL? What do you conclude from your answers to these questions?

4.11 A logic is truth functionally complete if it supports, directly or indirectly, all possible connectives. Truth functional completeness is an extremely important property; a logic that didn’t satisfy it would be like an arithmetic that had no support for certain operations, say “+”. Is classical 2VL truth functionally complete? Is SQL’s 3VL truth functionally complete?

4.12 Let bx be a boolean expression. Then bx OR NOT bx is also a boolean expression, and in 2VL it’s guaranteed to evaluate to TRUE (it’s an example of what logicians call a tautology). Is it a tautology in 3VL? If not, is there an analogous tautology in 3VL?

4.13 With bx as in the previous exercise, bx AND NOT bx is also a boolean expression, and in 2VL it’s guaranteed to evaluate to FALSE (it’s an example of what logicians call a contradiction). Is it a contradiction in 3VL? If not, is there an analogous contradiction in 3VL?

4.14 In 2VL, r JOIN r is equal to r and INTERSECT and TIMES are both special cases of JOIN (see Chapter 6). Are these observations still valid in 3VL?

4.15 The following is a legitimate SQL row value constructor invocation: ROW (1,NULL). Is the row it denotes null or nonnull?

4.16 Let bx be an SQL boolean expression. Then NOT (bx) and (bx) IS NOT TRUE are both SQL boolean expressions. Are they equivalent?

4.17 Let x be an SQL expression. Then x IS NOT NULL and NOT (x IS NULL) are both SQL boolean expressions. Are they equivalent?

4.18 Let DEPT and EMP be SQL tables; let DNO be a column in both; let ENO be a column in EMP; and consider the expression DEPT.DNO = EMP.DNO AND EMP.DNO = ‘D1’ (this expression might be part of the WHERE clause in some query, for example). Now, a “good” optimizer might very well transform this expression into DEPT.DNO = EMP.DNO AND EMP.DNO = ‘D1’ AND DEPT.DNO = ‘D1’, on the grounds that a = b and b = c together imply that a = c (see Exercise 6.13 in Chapter 6). But is this transformation valid? If not, why not? And what are the implications?

4.19 Suppose the suppliers-and-parts database permits nulls; in particular, suppose columns SP.SNO and SP.PNO permit nulls.[59] Here then is a query on that database, expressed for reasons beyond the scope of this chapter not in SQL but in a kind of pidgin form of relational calculus (see Chapter 10):

     S WHERE NOT EXISTS SP ( SP.SNO = S.SNO AND SP.PNO = 'P2' )

What does this query mean? And is the following formulation equivalent?

     S WHERE NOT ( S.SNO IN ( SP.SNO WHERE SP.PNO = 'P2' ) )

4.20 Let k1 and k2 be values of the same type. In SQL, then, what exactly do the following statements mean?

  1. k1 and k2 are “the same” for the purposes of a comparison in, e.g., a WHERE clause.

  2. k1 and k2 are “the same” for the purposes of key uniqueness.

  3. k1 and k2 are “the same” for the purposes of duplicate elimination.

4.21 In the body of the chapter, I said UNION ALL can generate duplicates. But what about INTERSECT ALL and EXCEPT ALL?

4.22 Are the recommendations “Always specify DISTINCT” and “Never specify ALL” duplicates of each other?

4.23 If TABLE_DEE corresponds to TRUE (or yes) and TABLE_DUM to FALSE (or no), then what corresponds to UNKNOWN (or maybe)?

4.24 The following quotes are taken from the SQL standard:[60]

  • “The data type boolean comprises the distinct truth values True and False. Unless prohibited by a NOT NULL constraint, the boolean data type also supports the truth value Unknown as the null value. This [standard] does not make a distinction between the null value of the boolean data type and the truth value Unknown ... [They] may be used interchangeably to mean exactly the same thing.”

  • “All boolean values and SQL truth values are comparable ... The value True is greater than the value False, and any comparison involving the null value or an Unknown truth value will return an Unknown result.”

Do you have any comments on these quotes? In particular, which if any of the following do you think are legal SQL expressions? And what do they return, if they’re legal?

  1. TRUE OR FALSE

  2. TRUE OR UNKNOWN

  3. TRUE OR NULL

  4. TRUE > FALSE

  5. TRUE > UNKNOWN

  6. TRUE > NULL

4.25 In his book Using the New DB2 (Morgan Kaufmann, 1996), in a section titled “A Brief History of SQL,” Don Chamberlin—who is widely acknowledged to be “the father of SQL”—has the following to say (I’m quoting the text more or less verbatim, except that I’ve added some italics):

During the early development of SQL ... some decisions were made that were ultimately to generate a great deal [of] controversy ... Chief among these were the decisions to support null values [sic] and to permit duplicate rows ... I will [briefly examine] the reasons for these decisions ... My purpose here is historical rather than persuasive ... I recognize that nulls and duplicates are religious topics, and I do not expect anyone to have a conversion experience after reading this chapter.

Do you agree with Chamberlin that nulls and duplicates are “religious topics”?



[58] Useful, that is, if we buy into the notion that 3VL as such is useful, which of course I don’t.

[59] If {SNO,PNO} is the primary key for shipments, then columns SP.SNO and SP.PNO couldn’t permit nulls without violating the entity integrity rule. So in case such a possibility bothers you (it doesn’t bother me, because I don’t believe in that rule anyway), let me change the example slightly; let me introduce another column, SPNO (shipment number), into the shipments table, and let me make {SPNO} the primary key. Then {SNO,PNO} will still be a key, but it won’t be the primary key, and the entity integrity rule therefore won’t apply. (Incidentally, the very fact that the entity integrity rule is supposed to apply only to primary keys, not to candidate keys in general, seems to me to be another reason to regard that rule with suspicion. Not to mention the fact that it’s also supposed to apply only to base tables, not to tables in general, which I think makes it more suspect still.)

[60] Note that the standard uses True, Unknown, and False in prose discussions but TRUE, UNKNOWN, and FALSE in its SQL grammar.

Get SQL and Relational Theory, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.