In the reducer phase, all of the data produced by the mappers must, at some point, be read and held in memory. In our example application, the reducer reads 14 separate files sequentially and builds up a mapping of (From, To) addresses to their corresponding counts. The number of unique combinations for this dataset is 311,209. That is, our final results file is a CSV with just over 311,000 lines, for a total of 18.2 MB. This is well within the limits of a single Lambda function: reading 14 files and keeping approximately 18 MB of data in memory is not a problem.
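The reducer's merge step can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: it assumes each mapper wrote its results as CSV rows of the form `from,to,count`, and the names `reduce_counts` and `to_csv` are hypothetical. In a real Lambda function, the text of each file would be read from wherever the mappers stored their output rather than passed in as strings.

```python
import csv
import io
from collections import Counter


def reduce_counts(mapper_outputs):
    """Merge per-mapper CSV text into one (From, To) -> count mapping.

    Assumption: every row has exactly three fields: from, to, count.
    """
    totals = Counter()
    # Process one mapper result at a time; only the running totals
    # stay in memory across iterations.
    for text in mapper_outputs:
        for row in csv.reader(io.StringIO(text)):
            if not row:
                continue  # skip blank lines
            from_addr, to_addr, count = row
            totals[(from_addr, to_addr)] += int(count)
    return totals


def to_csv(totals):
    """Serialize the merged mapping as CSV text, sorted by (From, To)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for (from_addr, to_addr), count in sorted(totals.items()):
        writer.writerow([from_addr, to_addr, count])
    return out.getvalue()
```

The memory footprint here is driven by the number of unique (From, To) keys rather than by the number of input files, which is why the size of the final results file is a reasonable proxy for what the reducer has to hold.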
Imagine a case where we are counting IP addresses from a large number of large log files along with some ...