COLLATIONS IN SQL

SQL’s rules regarding type checking and coercion, in the case of character strings in particular, are (sadly) rather more complex than I’ve been pretending so far, and I need to elaborate somewhat. Actually it’s impossible in a book of this nature to do more than just scratch the surface of the matter, but the basic idea is this: Any given character string (a) consists of characters from one associated character set and (b) has one associated collation. A collation—also known as a collating sequence—is a rule that’s associated with a specific character set and governs the comparison of strings of characters from that character set. Let C be a collation for character set S, and let a and b be any two characters from S. Then C must be such that exactly one of the comparisons a < b, a = b, and a > b evaluates to TRUE and the other two to FALSE (under C). Note: In early versions of SQL there was just one character set, that character set had just one collation, and that collation was based on the numerical order of the binary codes used to represent the characters in that character set. But there’s no intrinsic reason why collating sequences should have to depend on internal coding schemes, and there are good practical reasons why they shouldn’t.

So much for the basic idea. However, there are complications. One arises from the fact that any given collation can have either PAD SPACE or NO PAD defined for it. Suppose the character strings ‘AB’ and ‘AB’ (note the trailing ...

Get SQL and Relational Theory, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.