HyperLogLogs

The newest Redis data type is a probabilistic data structure that provides an estimated count of unique items in a collection. Under typical or normal situations, to get a unique count of a collection's items requires an amount of memory that is equal to the number of items or at least a time complexity of O(n). Why? To ensure that no items are double-counted if they are duplicated in the collection, the algorithm must keep a record of each item for comparison with any new items. This amount of overhead becomes quite large and expensive to calculate as the size of the collections increases in the order of millions of items. In contrast, storing unique elements in a HyperLogLog structure computes and stores an estimate of the size ...

Get Mastering Redis now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.