DISTINCT

CQL has a construct that intrinsically removes duplicate partition key entries from a result set, using the DISTINCT keyword. It works in much the same way as its SQL counterpart:

SELECT DISTINCT last_name FROM customer;

 last_name
-----------
       Tam
 Washburne

(2 rows)

The main difference is that DISTINCT in CQL operates only on partition keys and static columns.
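To make the restriction concrete, here is a minimal sketch. The customer_orders table and all of its columns are hypothetical, invented for illustration, and are not part of the original example schema.

```cql
-- Hypothetical table, for illustration only.
CREATE TABLE customer_orders (
    last_name    text,          -- partition key
    order_id     timeuuid,      -- clustering column
    loyalty_tier text STATIC,   -- static column: one value shared by the whole partition
    total        decimal,       -- regular column
    PRIMARY KEY ((last_name), order_id)
);

-- Accepted: partition key only.
SELECT DISTINCT last_name FROM customer_orders;

-- Accepted: partition key plus a static column.
SELECT DISTINCT last_name, loyalty_tier FROM customer_orders;

-- Rejected: order_id is a clustering column and total is a regular column.
-- SELECT DISTINCT last_name, order_id FROM customer_orders;
-- SELECT DISTINCT last_name, total FROM customer_orders;
```

The two commented-out queries should fail with an invalid-request error, because they ask DISTINCT to return columns that vary within a partition. If the table had a composite partition key, a DISTINCT query would also need to select every column of that key.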

DISTINCT is only useful when running an unbound query, that is, one with no WHERE clause. Such a query can appear to run efficiently when only a small number of partition keys (fewer than 100) is involved. Remember, however, that it still has to reach out to all the nodes in the cluster.
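The difference between an unbound and a bound query can be sketched as follows, assuming (as the earlier example suggests) that last_name is the sole partition key of the customer table.

```cql
-- Unbound: no WHERE clause, so the query has to consult every node
-- to collect the partition keys.
SELECT DISTINCT last_name FROM customer;

-- Bound to a single partition: the result is at most one row,
-- so DISTINCT adds nothing here.
SELECT DISTINCT last_name FROM customer WHERE last_name = 'Tam';
```

In the second query the partition is already identified by the WHERE clause, which is why DISTINCT only earns its keep in the unbound form.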
