BASE vs. DERIVED RELATIONS

As I explained earlier, the operators of the relational algebra allow us to start with some given relations, such as the ones depicted in Figure 1-3, and obtain further relations from those given ones (for example, by doing queries). The given relations are referred to as base relations, the others are derived relations. In order to get us started, therefore, a relational system has to provide a means for defining those base relations in the first place. In SQL, this task is performed by the CREATE TABLE statement (the SQL counterpart to a base relation being, of course, a base table, which is what CREATE TABLE creates). And base relations obviously need to be named—for example:

     CREATE TABLE S ... ;

But certain derived relations, including in particular what are called views, are named too. A view (also known as a virtual relation) is a named relation whose value at any given time t is the result of evaluating a certain relational expression at that time t. Here’s an SQL example:

     CREATE VIEW SST_PARIS AS
          ( SELECT SNO , STATUS
            FROM   S
            WHERE  CITY = 'Paris' ) ;

In principle, you can operate on views just as if they were base relations,[14] but they aren’t base relations. Instead, you can think of a view as being “materialized”—in effect, you can think of a base relation being constructed whose value is obtained by evaluating the specified relational expression—at the time the view in question is referenced. But I must emphasize that thinking of views being materialized in this way when they’re referenced is purely conceptual; it’s just a way of thinking; it’s not what’s really supposed to happen; and it wouldn’t work for update operations in any case. How views are really supposed to work is explained in Chapter 9.

By the way, there’s an important point I need to make here. You’ll often hear the difference between base relations and views described like this (warning! untruths coming up!):

  • Base relations really exist—that is, they’re physically stored in the database.

  • Views, by contrast, don’t “really exist”—they merely provide different ways of looking at the base relations.

But the relational model has nothing to say as to what’s physically stored!—in fact, it has nothing to say about physical storage matters at all. In particular, it categorically does not say that base relations are physically stored. The only requirement is that there must be some mapping between whatever is physically stored and those base relations, so that those base relations can somehow be obtained when they’re needed (conceptually, at any rate). If the base relations can be obtained from whatever’s physically stored, then everything else can be, too. For example, we might physically store the join of suppliers and shipments, instead of storing them separately; then base relations S and SP could be obtained, conceptually, by taking appropriate projections of that join. In other words: Base relations are no more (and no less!) “physical” than views are, so far as the relational model is concerned.

The fact that the relational model says nothing about physical storage is deliberate, of course. The idea was to give implementers the freedom to implement the model in whatever way they chose—in particular, in whatever way seemed likely to yield good performance—without compromising on physical data independence. The sad fact is, however, most SQL product vendors seem not to have understood this point (or not to have risen to the challenge, at any rate); instead, they map base tables fairly directly to physical storage,[15] and (as noted previously) their products therefore provide far less physical data independence than relational systems are or should be capable of. Indeed, this state of affairs is reflected in the SQL standard itself (as well as in most other SQL documentation), which typically—quite ubiquitously, in fact—talks in terms of “tables and views.” Clearly, anyone who talks this way is under the impression that tables and views are different things, and probably also that “tables” always means base tables specifically, and probably also that base tables are physically stored and views aren’t. But the whole point about a view is that it is a table (or, as I would prefer to say, a relation); that is, we can perform the same kinds of operations on views as we can on regular relations (at least in the relational model), because views are “regular relations.” Throughout this book, therefore, I’ll use the term relation to mean a relation (possibly a base relation, possibly a view, possibly a query result, and so on); and if I want to mean a base relation specifically, then I’ll say “base relation.” Recommendation: I suggest strongly that you adopt the same discipline for yourself. Don’t fall into the common trap of thinking the term relation means a base relation specifically—or, in SQL terms, thinking the term table means a base table specifically. Likewise, don’t fall into the common trap of thinking base relations (or base tables, in SQL) have to be physically stored.



[14] You might be thinking this claim can’t be 100 percent true for update operations. If so, you might be right as far as today’s SQL products are concerned; nevertheless, I still claim it’s true in principle. See the section UPDATE OPERATIONS in Chapter 9 for further discussion.

[15] I say this knowing full well that the majority of today’s SQL products do provide a variety of options for hashing, partitioning, indexing, clustering, and otherwise organizing the data as stored on the disk. Despite this state of affairs, I still consider the mapping from base tables to physical storage in those products to be fairly direct.

Get SQL and Relational Theory, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.