View Updating and Relational Theory

Preface

This book is the third in a series. Its predecessors were as follows:

SQL and Relational Theory: How to Write Accurate SQL Code (2nd edition)
Database Design and Relational Theory: Normal Forms and All That Jazz

Both of these books were published by O’Reilly in 2012. The first was aimed at database practitioners of all kinds; it explained the principles of relational theory and used those principles as a basis for recommendations on how to use SQL as if it were a true relational language (a discipline I referred to in that book as “using SQL relationally”). The second was a little more specialized; it was aimed at database professionals with an interest in database design specifically, and it explained the theory of relational database design and showed why that theory was important. And this third book is more specialized too, inasmuch as it also focuses on one specific technical issue—but the issue in question is an extremely important one, one that gets to the heart of how relational database systems really ought to behave (as opposed to the way today’s commercial SQL systems actually do behave, for the most part). That issue is a theory of updating: a theory that, as the book’s title indicates, applies to the updating of views in particular but is actually more general, in that it applies to the updating of “base data” just as much as it does to the updating of views as such. Note: Despite this latter state of affairs, I decided to emphasize the updating of views as such in the book’s title because it seems to me that, while database practitioners in general believe they understand how updating works when the target is base data, they’re typically more than a little skeptical as to whether it really works, or can be made to work, when the target is a view. In fact, view updating as such is a surprisingly controversial topic—which was and is, of course, a strong reason for wanting to write this book in the first place.

With regard to those two earlier books, incidentally, I should probably apologize for the large number of references to them (especially the first one) in the present book. Now, most references in this book to other publications are given in full, as in this example:

David McGoveran: “Accessing and Updating Views and Relations in a Relational Database,” U.S. Patent No. 7,263,512 (August 28th, 2007)

In the case of those previous books of mine in particular, however, I’ll refer to them from this point forward by their abbreviated titles alone (viz., SQL and Relational Theory and Database Design and Relational Theory, respectively).

Aside: I’ve said I’ll be giving references to other publications in full, but actually there aren’t many such references anyway. Although numerous papers, articles, and other writings on view updating have appeared over the past 30 years or so, most of them—with the notable exception of certain publications by David McGoveran—advocate approaches that differ fairly drastically from the one described in the present book (see later in this preface for further discussion of this point). For the most part, therefore, I felt it inappropriate to reference them, except for an occasional citation here and there. If you’re interested in investigating some of those other approaches in more detail, you can find a short list of pertinent references in Chapter 10 of my book An Introduction to Database Systems (8th edition, Addison-Wesley, 2004). End of aside.

I should stress that I do assume throughout what follows that you’re familiar with much of what’s covered in the SQL and Relational Theory book in particular. For example, I certainly assume you know what relations, attributes, and tuples are. Now, I make no apology for this state of affairs, since the present book is aimed at database professionals and database professionals ought really to be familiar with most of what’s in that earlier book anyway. In order to make the present book a little more self-contained, however, I do offer in Chapter 2 a brief review of pertinent aspects of that earlier book. I also offer in Chapter 3 a more detailed summary of what views in particular are and how they’re supposed to work.

Who Should Read This Book

My target audience is database professionals, or more generally anyone interested in the relational model, relational technology, or relational systems in general. As already indicated, familiarity with the SQL and Relational Theory book would be a big help, but I believe the present book has fresh insights to offer regarding relational theory in general, with special reference to view updating in particular. Also, I think it’s worth pointing out that it might be possible to use the ideas contained herein to guide a “roll your own” implementation (of view updating, I mean), absent native support on the part of the pertinent DBMS.^[1] However, my dearest wish in this regard is that DBMS implementers in particular will read this book and will thereby be motivated to provide some native view update support in their own product. Note: I’d also like to mention that I have a live seminar available based on the material in this book. For further details, please go to the website www.justsql.co.uk/chris_date/chris_date.htm.

Structure of the Book

I’ve said I assume you know what relations, attributes, and tuples are; more specifically, I assume you know what views are, too, at least in general terms. Views were originally discussed (though not by that name) in Codd’s very first paper on the relational model:

E. F. Codd: “Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks,” IBM Research Report RJ599 (August 19th, 1969)

Now, the principal rationale for supporting views, as Codd himself foresaw in the paper just referenced, is that they provide the means by which—at least in principle—the important goal of logical data independence can be achieved. (The term logical data independence refers to the ability to change the logical design of a database without having to make corresponding changes in the way the database is perceived by users, thereby protecting investment in, among other things, existing user training and existing applications. See Chapter 3 for further discussion.) In other words, the primary raison d’être.for views is, precisely, the goal of logical data independence. But if we’re to achieve that goal in practice and not just in principle, then it’s clear that views have to be updatable.

So view updating is an important problem. As a consequence, it has been the focus of considerable attention for quite some time now (at least 35 years or so), in both commercial and academic environments, and several different approaches have been proposed—even implemented, in some cases. However, the approaches in question all fail to provide a truly satisfactory solution to the problem (not just in my opinion, but also in that of other writers, I hasten to add). In the case of today’s mainstream SQL products, for example, the view updating mechanisms are typically both:

Incomplete, meaning they fail entirely to support updates on certain theoretically updatable views, and also
Incorrect, meaning even the view updates they do support they implement incorrectly, at least in some cases

(Again, see Chapter 3 for further discussion of these points.) As for the research literature, it seems to me that the writings in question typically overlook certain important factors—factors that are crucial to a systematic, comprehensive, and correct solution to the problem. By contrast, the solution described in detail in this book is indeed, I believe, a “systematic, comprehensive, and correct” one. I also believe (though in this connection I must make it very clear that I’m not an implementer myself) that the proposed solution could be incorporated into a relational DBMS with comparatively modest conceptual extensions to the architecture of the system.

Aside: Note that I do carefully say “a relational DBMS” here. As will be seen, the proposed solution relies heavily on the ability to state integrity constraints declaratively (and on the ability of the DBMS to enforce them, of course). For my part, I regard such capabilities as a sine qua non of a truly relational system. As I’m sure you’re aware, however, most if not all of today’s SQL products are seriously deficient in this area. End of aside.

With the foregoing by way of preamble, let me now say something about the way the text is structured:

Chapter 1 provides a motivating example that illustrates in simple and familiar terms (actually SQL terms) the approach to view updating to be described in detail in subsequent chapters. In particular, it demonstrates that “updating is updating,” regardless of whether it’s a view or base data that’s being updated. That’s why, as I said earlier, the book is concerned with what might be called a theory of updating in general—a theory that does apply to views in particular, but applies to base data equally as much.
Next, as previously mentioned, Chapter 2 offers a brief review of pertinent aspects of relational theory. In particular, it emphasizes the nature of the database per se as “the one true variable” and hence as the proper target for all operations of an updating nature.
Chapter 3 then describes the view concept and related matters in detail. Of course, I’ve already said I assume you know what views are in general terms, but this chapter covers a lot of material you might not be so familiar with, material that’s essential to a proper understanding of subsequent chapters.
Chapters Chapter 4–13 then discuss, one by one, views based on a variety of familiar (and, in a few cases, possibly not so familiar) relational operators—restriction, projection, join, and so on. Chapter 4 in particular, on restriction views, also introduces by means of examples quite a lot of additional foundation material (in fact, the chapter is in some respects a continuation of Chapter 3). The chapter also gives some idea as to the plan to be followed in the next nine chapters.
Chapter 14 then investigates the question of combining operations (e.g., what’s involved in updating a join of two restrictions, or a union of two joins?), a question that raises some rather intriguing and possibly surprising issues.
Finally, Chapter 15 presents an approach to resolving certain ambiguities that might arise—or might be claimed to arise, at least—in connection with the scheme described in previous chapters.
There are also two appendixes. Appendix A goes into detail on certain aspects of the all important relational assignment operator. Appendix B contains definitions for purposes of reference of the various relational operators considered in detail in the body of the book.

Note: As the foregoing outline should be sufficient to suggest, the book is definitely meant to be read in sequence as written.

Technical Notes

There are a few further preliminary points I need to cover here. First of all, note that I follow the usual convention throughout this book in using the generic term update in lower case to refer to the INSERT, DELETE, and UPDATE operators considered collectively (as well as to what I just referred to as “the all important relational assignment operator”—see Chapter 2). When I want to refer to the UPDATE operator as such, I’ll set it in all upper case (“all caps”) as just shown. As for the INSERT and DELETE operators, however, where no ambiguity arises, it can be a little tedious always to set them in all caps—especially when they’re being used as qualifiers, as in, e.g., “INSERT rule” (“insert rule”?). I’ve therefore decided to use both forms in this book, letting context be my guide in any given situation (and I won’t pretend I’ve been all that consistent in this respect, either).

Second, please note that I use the term SQL to mean the standard version of that language specifically, not some proprietary dialect (barring explicit statements to the contrary). In particular, I follow the standard in assuming the pronunciation “ess cue ell,” not “sequel” (though this latter pronunciation is common in the field), thereby writing things like an SQL table, not a SQL table. Note: The SQL standard has been through several versions, or editions, over the years. The version current at the time of writing is SQL:2011. Here’s the formal reference:

International Organization for Standardization (ISO): Database Language SQL, Document ISO/IEC 9075:2011 (2011)

Third and last, I need to say something about my use of the term user; in particular, I need to explain what I mean by my frequent use of phrases such as “what the user sees” or “the user’s perception of the database.” In general, you can take the term user to refer to either an interactive user^[2] or an application programmer or both, as the context demands. As for “what the user sees” and similar phrases, what I’m referring to here is the fact that most users interact, not with the database in its entirety, but rather with some subset of that entire database, defined by what’s sometimes called a subschema. What’s more, thanks to the view mechanism, that subset can and often does involve some logical restructuring. In fact, we can (and I will) assume for simplicity, and without loss of generality, that the subset in question consists exclusively of views, even if some of the views in question are effectively identical to the base data from which they’re derived. Of course, to the user of that subset, that collection of views is the database! In other words, database is a relative term, in a sense. Thus, we can usefully, albeit somewhat loosely, define a database, at least for the purposes of this book, to be either a given collection of data—i.e., the given base data—or some specific subset, possibly restructured, of that given collection. Note: When I say “somewhat loosely” here, what I have in mind primarily is the fact that a database is more than just data as such—the pertinent integrity constraints need to be taken into account as well, as we’ll see in Chapters Chapter 2 and Chapter 3.

Acknowledgments

I’d like to begin by thanking my wife Lindy once again for her support throughout the production of this book, as well as all of its predecessors. I’d also like to thank my friends and colleagues Hugh Darwen, David Livingstone, and David McGoveran for their detailed and comprehensive reviews of earlier drafts of this book. Those reviewers and their reviews were all very helpful in different ways, but David McGoveran in particular deserves special thanks—first of all, for originally suggesting the basic idea on which the view updating approach described in this book is based; second, for communicating and collaborating with me on this topic many times over the past 20 years or so; and last but not least, for his extensive theoretical work in this area. David also went considerably beyond the call of duty in his review: He not only commented on the text as such, he actually compiled and sent me a series of short essays on various aspects of the subject matter. Those essays were extremely helpful to me in my task of rewriting, and I believe they’ve resulted in a greatly improved text. Of course, I haven’t incorporated all of his suggestions—I don’t believe any author ever does act on all of the comments he or she receives from reviewers! But I’ve tried to do justice to what seemed to me to be the most important and substantive of his comments. Of course, it goes without saying that, as always, any remaining errors are my responsibility.

C. J. Date

Healdsburg, California

2013

^[1]DBMS = database management system. Of course, there’s a difference between a DBMS and a database! Unfortunately, the industry very commonly uses the term database when it means either some commercial product, such as Oracle, or the particular copy of such a product that happens to be installed on some particular computer. I do not follow that usage in this book. The problem is, if you call the DBMS a database, what do you call the database?

^[2]But still someone who knows something about database issues, not a genuine “end user,” who might quite reasonably be totally ignorant of such matters.

Get View Updating and Relational Theory now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.

Start your free trial

View Updating and Relational Theory by C.J. Date

Preface

Who Should Read This Book

Structure of the Book

Technical Notes

Acknowledgments

Don’t leave empty-handed

It’s yours, free.

Check it out now on O’Reilly