Using DISTINCT to Eliminate Duplicates

Problem

You want to know which values are present in a set of values, without displaying duplicate values multiple times. Or you want to know how many distinct values there are.

Solution

Use DISTINCT to select unique values or COUNT(DISTINCT) to count them.

Discussion

One summary operation that doesn’t use aggregate functions is to determine which values or rows are contained in a dataset by eliminating duplicates. Do this with DISTINCT (or DISTINCTROW, which is synonymous). DISTINCT is useful for boiling down a query result, and often is combined with ORDER BY to place the values in more meaningful order. For example, to determine the names of the drivers listed in the driver_log table, use the following statement:

mysql>SELECT DISTINCT name FROM driver_log ORDER BY name;
+-------+
| name  |
+-------+
| Ben   |
| Henry |
| Suzi  |
+-------+

A statement without DISTINCT produces the same names, but is not nearly as easy to understand, even with a small dataset:

mysql>SELECT name FROM driver_log;
+-------+
| name  |
+-------+
| Ben   |
| Suzi  |
| Henry |
| Henry |
| Ben   |
| Henry |
| Suzi  |
| Henry |
| Ben   |
| Henry |
+-------+

To determine how many different drivers there are, use COUNT(DISTINCT):

mysql>SELECT COUNT(DISTINCT name) FROM driver_log;
+----------------------+
| COUNT(DISTINCT name) |
+----------------------+
|                    3 |
+----------------------+

COUNT(DISTINCT) ignores NULL values. Should you wish to count NULL as one of the values in the set if it’s ...

Get MySQL Cookbook, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.