Hash generation

The standard practice for most PL/Proxy implementations is to partition based on the hashtext of the field you're splitting on, which allows splitting among a number of nodes fairly without knowing the distribution of the dataset in advance. hashtext is a PostgreSQL-provided internal function that takes in any text input and generates an integer as output with a hash code. If you AND the result at the bit level, to only take the lowest few bits, this turns out to be a very quick and fair way to distribute load among multiple systems. For this to work, the number of partitions needs to be a power of 2 (2, 4, 8, 16, and so on) and then you bitwise AND against one less than that number. So for two partitions that's & 1, for four ...

Get PostgreSQL 10 High Performance now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.