SOME REMARKS ON TERMINOLOGY

You probably noticed right away, in that bullet list of relational issues in the previous section, that I used the formal terms relation, tuple (usually pronounced to rhyme with couple), and attribute. SQL doesn’t use these terms, of course—it uses the more “user friendly” terms table, row, and column instead. And I’m generally sympathetic to the idea of using more user friendly terms, if they can help make the ideas more palatable. In the case at hand, however, it seems to me that, regrettably, they don’t make the ideas more palatable; instead, they distort them, and in fact do the cause of genuine understanding a grave disservice. The truth is, a relation is not a table, a tuple is not a row, and an attribute is not a column. And while it might be acceptable to pretend otherwise in informal contexts—indeed, I often do so myself—I would argue that it’s acceptable only if we all understand that the more user friendly terms are just an approximation to the truth and fail overall to capture the essence of what’s really going on. To put it another way: If you do understand the true state of affairs, then judicious use of the user friendly terms can be a good idea; but in order to learn and appreciate that true state of affairs in the first place, you really do need to come to grips with the formal terms. In this book, therefore, I’ll tend to use those formal terms (at least when I’m talking about the relational model as opposed to SQL), and I’ll give precise definitions for them at the relevant juncture. In SQL contexts, by contrast, I’ll use SQL’s own terms.
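As an informal illustration of why the two vocabularies aren't interchangeable, the sketch below contrasts two toy models. It assumes the usual relational-theory reading, in which a relation contains no duplicate tuples and imposes no ordering on its tuples or attributes, while an SQL table can hold duplicate rows and has a left-to-right column order. The helper `make_tuple` and the attribute names `SNO` and `CITY` are hypothetical, chosen only for this example; this isn't the precise definition promised for later in the book.

```python
# A tuple modeled as a set of (attribute name, value) pairs:
# there is no left-to-right ordering of attributes.
def make_tuple(**attrs):
    return frozenset(attrs.items())

# A relation body modeled as a set of tuples:
# no duplicates and no top-to-bottom ordering.
relation = {
    make_tuple(SNO="S1", CITY="London"),
    make_tuple(CITY="London", SNO="S1"),  # same tuple, attributes written in another order
    make_tuple(SNO="S2", CITY="Paris"),
}

# An SQL-style table modeled as a list of rows:
# duplicates are kept and column position matters.
table = [
    ("S1", "London"),
    ("S1", "London"),
    ("S2", "Paris"),
]

relation_size = len(relation)  # the repeated tuple collapses into one
table_size = len(table)        # the repeated row is counted twice
```

Under these assumptions the relation ends up with two tuples while the table keeps three rows, which is one concrete sense in which "table" is only an approximation to "relation".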

And another point on terminology: Having said that SQL tries to simplify one set of terms, I must say too that it does its best to complicate another. I refer to its use of the terms operator, function, procedure, routine, and method, all of which denote essentially the same thing (with, perhaps, very minor differences). In this book I’ll use the term operator throughout; thus, for example, I’ll refer to “=” (equality comparison), “:=” (assignment), “+” (addition), DISTINCT, JOIN, SUM, GROUP BY (etc., etc.) all as operators specifically.

Talking of SQL, incidentally, let me remind you that (as stated in the preface) I use that term to mean the standard version of the language exclusively, except in a few places where the context demands otherwise.[4] However:

  • The standard’s use of terminology is sometimes not very apt. In such situations, I generally prefer to use terminology of my own. For example, I use the term table expression in place of the standard term query expression, for the following reasons among others: First, the value such expressions denote is indeed a table and not a query; second, queries aren’t the only context in which such expressions are used anyway. (As a matter of fact the standard does use the term table expression, but again it does so quite inappropriately; to be specific, it uses it to refer to what comes after the SELECT clause in a SELECT expression.)

  • Following on from the previous point, I should add that not all table expressions—in either my sense or the standard’s—are legal in SQL in all contexts where they might be expected to be. In particular, an explicit JOIN invocation, although it certainly does denote a table, can’t appear as a “stand alone” table expression (i.e., at the outermost level of nesting), nor can it appear as the table expression in parentheses that constitutes a subquery (see Chapter 12).[5] Please note that these remarks apply to many of the individual discussions in the body of the book; it would be very tedious to keep on repeating them, however, and I won’t. (They’re reflected in the BNF grammar in Chapter 12, however.)

  • I ignore aspects of the standard that might be regarded as a trifle esoteric—especially if they aren’t part of what the standard calls Core SQL or don’t have much to do with relational processing as such. Examples here include the so called analytic or window (OLAP) functions; dynamic SQL; temporary tables; and details of user defined types.

  • For reasons that aren’t important here, I use a style for comments that differs from that of the standard. To be specific, I show comments as text strings in italics, bracketed by “/*” and “*/” delimiters.
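The second point in the list above (an explicit JOIN invocation can't appear as a stand-alone table expression) can be tried out against a real product. The sketch below uses Python's standard `sqlite3` module. Bear in mind that SQLite is one product and not the standard, so this shows only that SQLite's parser happens to behave the same way on this point. The tables `S` and `SP` and their columns are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (SNO TEXT PRIMARY KEY, SNAME TEXT NOT NULL);
    CREATE TABLE SP (SNO TEXT NOT NULL, PNO TEXT NOT NULL, QTY INTEGER NOT NULL,
                     PRIMARY KEY (SNO, PNO));
    INSERT INTO S  VALUES ('S1', 'Smith'), ('S2', 'Jones');
    INSERT INTO SP VALUES ('S1', 'P1', 300), ('S2', 'P1', 300);
""")

# Attempt 1: the explicit JOIN invocation on its own, at the outermost level.
try:
    conn.execute("S NATURAL JOIN SP")
    standalone_accepted = True
except sqlite3.OperationalError:
    # SQLite reports a syntax error: a statement can't begin with a table name.
    standalone_accepted = False

# Attempt 2: the same join wrapped in a SELECT expression.
rows = conn.execute("SELECT * FROM S NATURAL JOIN SP").fetchall()
conn.close()
```

In SQLite the first attempt is rejected and the second returns the joined rows, with the common column `SNO` appearing once. Other products may be more permissive, so it's worth checking the behavior of whichever one you use.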

Be aware, however, that all SQL products include features that aren’t part of the standard per se. Row IDs provide a common example. My general advice regarding such features is: By all means use them if you want to—but not if they violate relational principles (after all, what I’m advocating is supposed to be a relational approach to SQL). For example, row IDs in particular are likely to violate either The Principle of Interchangeability (see Chapter 9) or The Information Principle (see Appendix A) or both; and if they do, then I certainly wouldn’t use them. But, here and everywhere, the overriding rule is: You can do what you like, so long as you know what you’re doing.



[4] The standard has been through several versions, or editions, over the years. The version current at the time of writing is SQL:2008 (a formal reference for which can be found in Appendix G); the previous version was SQL:2003, the one before that was SQL:1999, and the one before that was SQL:1992. Most of the SQL features discussed in this book were present in SQL:1992, and often in even earlier versions.

[5] These particular limitations were added in SQL:2003; they didn’t apply to SQL:1992, which is where explicit JOIN invocations were first introduced, nor to SQL:1999.
