Driver function

Something has to trigger the process to get it started. Here, we'll do that manually. The driver function is responsible for setting up the whole job and invoking the mappers in parallel, and it does so using a few straightforward techniques.
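As a rough illustration of what "setting up the job and invoking the mappers in parallel" could look like, here is a minimal sketch. It is not the book's implementation. It assumes the mappers are AWS Lambda functions invoked asynchronously through boto3, and the names `driver`, `split_into_batches`, `invoke_mapper_async`, the function name `"mapper"`, and the payload fields are all hypothetical.

```python
import json
import uuid
from concurrent.futures import ThreadPoolExecutor


def invoke_mapper_async(function_name, payload):
    """Fire-and-forget invocation of one mapper (assumes AWS Lambda and boto3)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    client = boto3.client("lambda")
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: do not wait for the mapper to finish
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return response["StatusCode"]


def split_into_batches(keys, batch_size):
    """Split the list of input keys into consecutive batches, one per mapper."""
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]


def driver(keys, batch_size, mapper_name="mapper", invoke=invoke_mapper_async):
    """Set up one job and hand each batch of input keys to its own mapper."""
    job_id = str(uuid.uuid4())  # hypothetical identifier tying the batches together
    payloads = [
        {"job_id": job_id, "batch_id": batch_id, "keys": batch}
        for batch_id, batch in enumerate(split_into_batches(keys, batch_size))
    ]
    # Threads are enough here: each call only waits on a network round trip.
    workers = max(1, min(len(payloads), 8))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: invoke(mapper_name, p), payloads))
```

Passing the invoker in as a parameter is a design choice made for this sketch: it lets the fan-out logic be exercised locally with a stand-in function before any cloud resources exist.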

By their nature, MapReduce jobs are batch-oriented: they start up, do their work, write the results somewhere, and finally shut down. Running them on a schedule (hourly, nightly, or weekly) therefore makes sense. In a real deployment where the input data keeps changing, it would be trivial to set up this driver function to run on a schedule.

As usual, the entry point for all our functions is the handler.py file, which I have not shown. The ...
