Calculating a moving median

The median of a list of numbers has an equal number of values less than and greater than it. The naive approach of calculating the median is to simply sort the list and pick the middle number. However, on a very large dataset, such a computation would be inefficient.

Another approach of finding a moving median is to use a combination of a minheap and a maxheap to sort the values while running through the data. We can insert numbers in either heap as they are seen, and whenever needed, the median can be calculated by adjusting the heaps to be of equal or near equal size. When the heaps are of equal size, it is simple to find the middle number, which is the median.

Getting ready

Create a file, input.txt, with some numbers: ...

Get Haskell Data Analysis Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.