Querying an entire table

Almost as bad as building a table with multiple secondary indexes are use cases that require multi-key or unbound queries. Many developers just need to know exactly how many rows their 200 GB table contains, so they run an unbound query while selecting a count:

SELECT COUNT(*) FROM some_really_huge_table_that_will_timeout;

Cassandra is simply not designed to scan its entire contents and return them in a nicely formatted report. Queries such as this on large tables will likely result in a timeout.

An application team may ask for the query request timeout to be relaxed to accommodate an occasional unbound query such as this. My advice is to say NO. Writing queries within the constraints of the database is a responsibility ...
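
For contrast, here is a minimal sketch of a count that stays within those constraints by restricting on the partition key. The table events_by_day and its columns are hypothetical, invented purely for illustration; the assumption is a data model with one partition per day.

```cql
-- Hypothetical table (not from the text): one partition per day.
CREATE TABLE IF NOT EXISTS events_by_day (
    day date,
    event_time timestamp,
    event_id uuid,
    payload text,
    PRIMARY KEY ((day), event_time, event_id)
);

-- Bound query: the WHERE clause names the partition key, so the
-- coordinator reads a single partition instead of scanning every node.
SELECT COUNT(*) FROM events_by_day WHERE day = '2018-07-04';
```

A count like this is only as cheap as the partition is small, so it still depends on a data model that keeps partitions to a reasonable size.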
