AVOIDING NULLS IN SQL

The relational model prohibits nulls; to use SQL relationally, therefore, steps must be taken to prevent them from occurring. First of all, a NOT NULL constraint should be specified, explicitly or implicitly, for every column in every base table (see Chapter 5); then nulls will never occur in base tables as such. Unfortunately, however, certain SQL expressions can still yield result tables containing nulls. Here are some of the situations in which nulls can be produced:

  • The SQL “set functions” such as SUM all return null if their argument is empty (except for COUNT and COUNT(*), which correctly return zero in such a situation).

  • If a scalar subquery evaluates to an empty table, that empty table is coerced to a null.

  • If a row subquery evaluates to an empty table, that empty table is coerced to a row of all nulls. Note: A row of all nulls and a null row aren’t the same thing at all, logically speaking (another logical difference here, in fact)—yet SQL does think they’re the same thing, at least some of the time. But it would take us much too far afield to get into the detailed implications of this state of affairs here.

  • Outer joins and “union joins” are expressly designed to produce nulls in their result.[57]

  • If the ELSE clause is omitted from a CASE expression, an ELSE clause of the form ELSE NULL is assumed.

  • The expression NULLIF(x,y) returns null if x = y evaluates to TRUE.

  • The “referential triggered actions” ON DELETE SET NULL and ON UPDATE SET NULL can both generate nulls (obviously enough).

    Strong recommendations:

  • Do specify NOT NULL, explicitly or implicitly, for every column in every base table.

  • Don’t use the keyword NULL in any other context whatsoever (i.e., anywhere other than a NOT NULL constraint or logical equivalent).

  • Don’t use the keyword UNKNOWN in any context whatsoever.

  • Don’t omit the ELSE clause from a CASE expression unless you’re certain it would never have been reached anyway.

  • Don’t use NULLIF.

  • Don’t use outer join, and don’t use the keywords OUTER, FULL, LEFT, and RIGHT (except possibly as suggested in the section A REMARK ON OUTER JOIN below).

  • Don’t use union join.

  • Don’t specify either PARTIAL or FULL on MATCH (they have meaning only when nulls are present). For similar reasons, don’t use the MATCH option on foreign key constraints, and don’t use IS DISTINCT FROM. (In the absence of nulls, the expression a IS DISTINCT FROM b is logically equivalent to the expression a <> b.)

  • Don’t use IS TRUE, IS NOT TRUE, IS FALSE, or IS NOT FALSE. The reason is that, if bx is a boolean expression, then the following logical equivalences fail to be valid only if nulls are present:

         bx IS TRUE       ≡  bx
         bx IS NOT TRUE   ≡  NOT bx
         bx IS FALSE      ≡  NOT bx
         bx IS NOT FALSE  ≡  bx

    In other words, IS TRUE and the rest are distractions at best, in the absence of nulls.

  • Finally, do use COALESCE on every scalar expression that might “evaluate to null” without it. (Apologies for the quotation marks, but the fact is that the phrase “evaluates to null” is a solecism.)

In case you’re not familiar with COALESCE, let me elaborate briefly on the last of these recommendations. Essentially, COALESCE is an operator that lets us replace a null by some nonnull value “as soon as it appears” (i.e., before it has a chance to do any significant damage). Here’s the definition: Let a, b, ..., c be scalar expressions. Then the expression COALESCE (a,b,...,c) returns null if its arguments are all null, or the value of its first nonnull argument otherwise. Of course, to use it “sensibly,” you do need to make sure at least one of a, b, ..., c is nonnull! Here’s a fairly realistic example:

     SELECT S.SNO , ( SELECT COALESCE ( SUM ( ALL SP.QTY ) , 0 )
                      FROM   SP
                      WHERE  SP.SNO = S.SNO ) AS TOTQ
     FROM   S

In this example, if the SUM invocation “evaluates to null”—which it will do in particular for any supplier that doesn’t have any matching shipments—then the COALESCE invocation will replace that null by a zero. (Incidentally, this example also illustrates a situation in which use of ALL instead of DISTINCT isn’t just acceptable but is logically required, though it might be implicit. See Chapter 7.) Given our usual sample data, therefore, the query produces the following result:

SNO

TOTQ

S1

1300

S2

700

S3

200

S4

900

S5

0



[57] SQL’s UNION JOIN operator, which was a flawed attempt to support an already flawed operator called outer union, was introduced in SQL:1992 and dropped in SQL:2003.

Get SQL and Relational Theory, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.