If the number of events per day is in the millions or tens of millions, querying all events for that day can be extremely expensive. For that reason, it makes sense to do part of the work over smaller periods of time.
Using a summary index to store these interim values can be overkill if those values are not needed for long. In the "Calculating top for a large time frame" section, we ended up storing thousands of values every few minutes. If we simply want to know the top 10 per day, this might be seen as wasteful. To cut down on the noise in our summary index, we can use a CSV file as cheap interim storage.
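To make the idea of a CSV as interim storage concrete, here is a minimal sketch written in Python rather than in Splunk's search language. It is only an illustration of the pattern, not the searches used in this chapter: the function names `update_running_counts` and `top_n`, the column names `value` and `count`, and the file layout are all assumptions made for the example.

```python
import csv
import os
from collections import Counter


def update_running_counts(csv_path, recent_values):
    """Merge counts of recent_values into the running totals kept in csv_path.

    The CSV layout (columns "value" and "count") is an assumption for this sketch.
    """
    totals = Counter()

    # Load the totals accumulated by earlier runs, if the file exists yet.
    if os.path.exists(csv_path):
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                totals[row["value"]] += int(row["count"])

    # Add one count for each value seen in the most recent slice of data.
    totals.update(recent_values)

    # Overwrite the CSV with the merged totals.
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["value", "count"])
        for value, count in totals.items():
            writer.writerow([value, count])

    return totals


def top_n(csv_path, n=10):
    """Return the n (value, count) pairs with the highest counts."""
    with open(csv_path, newline="") as f:
        rows = [(r["value"], int(r["count"])) for r in csv.DictReader(f)]
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]
```

Each periodic run only has to look at a small slice of recent data, while the CSV carries the running totals forward. The end-of-day top 10 is then read from the file without querying the full day of events again.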
The steps are essentially to:
- Periodically query recent data and update the CSV
- Capture top ...