Batch layer

When the lambda architecture was proposed, Hadoop was already widely adopted and regarded as a proven technology. Hadoop is a linearly scalable system and can easily churn through terabytes of data in a reasonable amount of time to calculate nearly anything from your dataset. Here, reasonable may mean a job that runs for a few hours in the middle of the night so that new views of your data are ready first thing in the morning.

Using our temperature monitoring analogy, a day's worth of data will require a new batch job run if we need to calculate the average temperature in the month or year. Also, imagine we wanted to calculate trends day by day, month by month, or year by year. Whenever a new batch of daily temperatures is completed, ...

Get Serverless Design Patterns and Best Practices now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.