
Parallel R by Stephen Weston, Q. Ethan McCallum


Chapter 8. Segue

Welcome to the last of the book’s recipes for R parallelism. This will be a short chapter, but don’t let that fool you: Segue’s scope is intentionally narrow, and that focus makes it a particularly powerful tool.

Segue’s mission is as simple as it gets: make it easy to use Elastic MapReduce as a parallel backend for lapply()-style operations. So easy, in fact, that it boasts of doing this in only two lines of R code.[59]

This narrow focus is no accident. Segue’s creator, JD Long, wanted occasional access to a Hadoop cluster to run his pleasantly parallel,[60] computationally expensive models. Elastic MapReduce was a great fit but still a bit cumbersome for his workflow. He created Segue to tackle the grunt work so he could focus on his higher-level modeling tasks.

Segue is a relatively young package. Nonetheless, since its creation in 2010, it has attracted a fair amount of attention.

Quick Look

Motivation: You want Hadoop power to drive some lapply() loops, perhaps for a parameter sweep, but you want minimal Hadoop contact. You consider MapReduce to be too much of a distraction from your work.

Solution: Use the segue package’s emrlapply() to send your calculations up to Elastic MapReduce, the Amazon Web Services cloud-based Hadoop product.

Good because: You get to focus on your modeling work, while segue takes care of transforming your lapply() work into a Hadoop job.
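
As a rough illustration of the pattern described above, here is a minimal sketch of a parameter sweep written with plain lapply(). The function estimate_pi() and the list sample_sizes are hypothetical names invented for this example. The commented-out lines show how the same call might be handed to emrlapply(), assuming the setCredentials(), createCluster(), and stopCluster() helpers described in segue's own documentation. Those lines are not run here, since they need AWS credentials and start billable instances.

```r
# Hypothetical model function: Monte Carlo estimate of pi from n random points.
estimate_pi <- function(n) {
  set.seed(42)                   # fixed seed so each run is reproducible
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)       # fraction of points inside the quarter circle
}

# Hypothetical parameter sweep: one list element per sample size.
sample_sizes <- list(1000, 10000, 100000)

# Local, sequential version using base R.
local_results <- lapply(sample_sizes, estimate_pi)

# Assumed segue equivalent (not run: requires AWS credentials and incurs charges).
# library(segue)
# setCredentials("YOUR_ACCESS_KEY", "YOUR_SECRET_KEY")
# cluster <- createCluster(numInstances = 3)
# emr_results <- emrlapply(cluster, sample_sizes, estimate_pi)
# stopCluster(cluster)
```

The point of the sketch is that the list and the function stay the same; only the call that applies one to the other changes.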

How It Works

Segue takes care of launching the Elastic MapReduce cluster, shipping data back and forth, and ...
