Implementing Splunk: Big Data Reporting and Development for Operational Intelligence by Vincent Bumgarner

Calculating top for a large time frame

One common problem is finding the top contributors out of a huge set of unique values. For instance, to know which IP addresses use the most bandwidth in a given day or week, you may have to track the total request size for each of millions of unique hosts to answer the question definitively. With summary indexes, this means storing millions of events in the summary index, which quickly defeats the point of using one.

Just to illustrate, let's look at a simple set of data:

Time    1.1.1.1   2.2.2.2   3.3.3.3   4.4.4.4   5.5.5.5   6.6.6.6
12:00   99        100       100       100       -         -
13:00   99        -         100       100       100       -
14:00   99        100       -         101       100       -
15:00   99        -         99        100       100       -
16:00   99        100       -         -         100       100

(A dash marks an hour with no value for that address.)
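
To make the problem concrete, here is a minimal Python sketch run over the sample data. It is an illustration only, not a Splunk query: the names `HOURLY`, `totals`, and `top` are hypothetical, empty cells are assumed to mean the address had no traffic in that hour, and the idea of keeping only the top three rows per hour is an assumed shortcut, chosen to show why a definitive answer needs a total for every address.

```python
from collections import defaultdict

# Sample data from the table: one value per IP address per hour.
# Assumption: an address missing from an hour had no traffic in that hour.
HOURLY = {
    "12:00": {"1.1.1.1": 99, "2.2.2.2": 100, "3.3.3.3": 100, "4.4.4.4": 100},
    "13:00": {"1.1.1.1": 99, "3.3.3.3": 100, "4.4.4.4": 100, "5.5.5.5": 100},
    "14:00": {"1.1.1.1": 99, "2.2.2.2": 100, "4.4.4.4": 101, "5.5.5.5": 100},
    "15:00": {"1.1.1.1": 99, "3.3.3.3": 99, "4.4.4.4": 100, "5.5.5.5": 100},
    "16:00": {"1.1.1.1": 99, "2.2.2.2": 100, "5.5.5.5": 100, "6.6.6.6": 100},
}


def totals(hourly, keep_per_hour=None):
    """Sum values per IP, optionally keeping only the top N rows of each hour."""
    result = defaultdict(int)
    for rows in hourly.values():
        # Sort by value descending, then by address, so ties break predictably.
        ranked = sorted(rows.items(), key=lambda kv: (-kv[1], kv[0]))
        # A slice end of None keeps every row.
        for ip, value in ranked[:keep_per_hour]:
            result[ip] += value
    return dict(result)


def top(totals_by_ip):
    """Return the (ip, total) pair with the largest total."""
    return max(totals_by_ip.items(), key=lambda kv: kv[1])


full = totals(HOURLY)                        # every row stored
truncated = totals(HOURLY, keep_per_hour=3)  # hypothetical shortcut

print("all rows kept: ", top(full))       # ('1.1.1.1', 495)
print("top 3 per hour:", top(truncated))  # ('4.4.4.4', 401)
```

With every row kept, 1.1.1.1 comes out on top with 495, even though it never has the largest value in any single hour. Keeping only three rows per hour reports 4.4.4.4 with 401 instead, which is why the full set of per-address totals is needed for a definitive answer.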
