SEMIJOIN AND SEMIDIFFERENCE

Join is one of the most familiar of all of the relational operators. In practice, however, it turns out that queries that require the join operator at all often really require an extended form of that operator called semijoin (you might not have heard of semijoin before, but in fact it’s quite important). Here’s the definition:

Definition: The semijoin of relations r1 and r2 (in that order), r1 MATCHING r2, is equivalent to (r1 JOIN r2){H1}, where {H1} is the heading of r1.

In other words, r1 MATCHING r2 is the join of r1 and r2, projected back on the attributes of r1 (and so the heading of the result is the same as that of r1). Here’s an example (“Get suppliers who currently supply at least one part”):

image with no caption

Note that the expressions r1 MATCHING r2 and r2 MATCHING r1 aren’t equivalent, in general—the first returns some subset of r1, the second returns some subset of r2. Note too that we could replace IN by MATCH in the SQL version; interestingly, however, we can’t replace NOT IN by NOT MATCH in the semidifference analog (see below), because there’s no “NOT MATCH” operator in SQL.

Turning now to semidifference: If semijoin is in some ways more important than join, a similar remark applies here also, but with even more force—in practice, most queries that require difference at all really require semidifference.[92] Here’s the definition:

Definition: The semidifference ...

Get SQL and Relational Theory, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.