Looking for runs of numbers within vectors
The function is called rle, which stands for ‘run length encoding’ and is most easily understood with an example. Here is a vector of 150 random numbers from a Poisson distribution with mean 0.7:
(poisson<-rpois(150,0.7))
[1] 1 1 0 0 2 1 0 1 0 1 0 0 0 0 2 1 0 0 3 1 0 0 1 0 2 0 1 1 0 0 0 1 0 0 0 2 1
[38] 0 0 0 1 0 0 0 2 0 0 0 1 1 0 2 1 0 0 0 2 0 0 2 3 2 1 0 2 0 0 0 0 0 1 1 0 0
[75] 0 0 0 1 1 1 0 0 1 0 1 2 2 0 0 2 0 0 0 0
[112] 0 0 2 0 0 1 0 1 0 4 0 0 1 0 2 1 0 1 1 0 0 1 3 3 0 0 1 1 0 1 0 0 0 0 0 1 0
[149] 2 0
We can do our own run length encoding on the vector by eye: there is a run of two 1s, then a run of two 0s, then a single 2, then a single 1, then a single 0, and so on. So the run lengths are 2, 2, 1, 1, 1, 1, ... and the values associated with these runs are 1, 0, 2, 1, 0, 1, ... Here is the output from rle:
rle(poisson)
Run Length Encoding
  lengths: int [1:93] 2 2 1 1 1 1 1 1 4 1 ...
  values : num [1:93] 1 0 2 1 0 1 0 1 0 2 ...
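The by-eye procedure described above can be sketched in a few lines of R. This is only an illustration of the idea, not a claim about how rle is implemented internally. The helper name my_rle and the short vector x (the first 14 values of the printed vector) are introduced here purely for the example.

```r
# First 14 values of the vector printed above
x <- c(1, 1, 0, 0, 2, 1, 0, 1, 0, 1, 0, 0, 0, 0)

# Hypothetical helper: a run ends wherever a value differs from its successor
my_rle <- function(v) {
  n <- length(v)
  if (n == 0) return(list(lengths = integer(0), values = v))
  ends <- c(which(v[-1] != v[-n]), n)   # index of the last element of each run
  list(lengths = diff(c(0L, ends)),     # distance between consecutive run ends
       values  = v[ends])               # the value that each run repeats
}

enc <- my_rle(x)
enc$lengths   # 2 2 1 1 1 1 1 1 4
enc$values    # 1 0 2 1 0 1 0 1 0

# inverse.rle rebuilds the original vector from an rle object
identical(inverse.rle(rle(x)), x)   # TRUE
```

The lengths and values agree with the first nine entries that rle reports for the full vector.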
The object produced by rle is a list of two vectors: the lengths and the values. To find the longest run, and the value associated with it, we index the list like this:
max(rle(poisson)[[1]])
[1] 7
So the longest run in this vector of numbers was 7. But 7 of what? We use which to find the location of the 7 in lengths, then apply this index to values to find the answer:
which(rle(poisson)[[1]]==7)
[1] 55
rle(poisson)[[2]][55]
[1] 0
So, not surprisingly given that the mean was just 0.7, ...
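The same lookup can be written without repeating the call to rle or typing the numbers 7 and 55 by hand. What follows is a minimal sketch: the names runs, longest, idx and starts are introduced here for illustration, and the seed is an arbitrary assumption made for reproducibility, so the numbers it produces will not match the output shown above.

```r
set.seed(1)                      # assumption: arbitrary seed, not from the text
poisson <- rpois(150, 0.7)

runs <- rle(poisson)             # encode once and reuse the result
longest <- max(runs$lengths)     # same as max(runs[[1]])

# All runs that attain the maximum length (there may be ties)
idx <- which(runs$lengths == longest)
runs$values[idx]                 # value(s) making up the longest run(s)

# Position in the original vector where each longest run starts
starts <- cumsum(runs$lengths) - runs$lengths + 1
starts[idx]
```

Using the component names lengths and values rather than [[1]] and [[2]] makes the intent easier to read, and which returns every tied run rather than assuming there is only one.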