Cross-column correlation

Cross-column correlation can lead the planner to misestimate the number of rows a query returns, because PostgreSQL by default assumes that the values in each column are independent of the values in every other column. When a query filters on several columns, the selectivities of the individual conditions are multiplied together, which underestimates the result when the columns are in fact correlated. In practice, such correlations are common. For example, first and last names are correlated in certain cultures, and so are the country and the language preference of users. To understand cross-column correlation, let's create a table called users, as follows:

CREATE TABLE users (
  id serial primary key,
  name text,
  country text,
  language text
);
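-- generate_random_text() is not a PostgreSQL built-in; it is assumed to be
-- a user-defined helper that was created beforehand.
INSERT INTO users(name, country, language)
  SELECT generate_random_text(8), 'Germany', 'German'
  FROM generate_series(1, 10);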
INSERT INTO users(name, country, language) ...
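
To make the effect of the independence assumption visible, the following is a minimal sketch rather than the book's own listing. It uses a separate, hypothetical table named users_demo with made-up row counts, so that it does not depend on the elided INSERT statements or on generate_random_text(). The CREATE STATISTICS step assumes PostgreSQL 10 or later, and the statistics name users_demo_country_language is likewise an illustrative choice.

```sql
-- Hypothetical demo table; the name and row counts are assumptions.
CREATE TABLE users_demo (
  id serial PRIMARY KEY,
  country text,
  language text
);

-- Two groups of 5,000 rows each, in which country fully determines language.
INSERT INTO users_demo (country, language)
  SELECT 'Germany', 'German' FROM generate_series(1, 5000);
INSERT INTO users_demo (country, language)
  SELECT 'France', 'French' FROM generate_series(1, 5000);

-- Collect per-column statistics.
ANALYZE users_demo;

-- Each condition alone matches about half of the table. Treating them as
-- independent gives roughly 10000 * 0.5 * 0.5 = 2500 estimated rows,
-- although 5000 rows actually match.
EXPLAIN SELECT * FROM users_demo
  WHERE country = 'Germany' AND language = 'German';

-- Extended statistics let the planner record the functional dependency
-- between the two columns (PostgreSQL 10 or later).
CREATE STATISTICS users_demo_country_language (dependencies)
  ON country, language FROM users_demo;
ANALYZE users_demo;

-- The row estimate should now move much closer to the actual count.
EXPLAIN SELECT * FROM users_demo
  WHERE country = 'Germany' AND language = 'German';
```

Comparing the rows value in the two EXPLAIN outputs shows how far the estimate drifts when the columns are treated as independent. Exact figures can vary with the PostgreSQL version and the statistics settings, so the numbers in the comments are approximate.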
