Parallel R by Stephen Weston, Q. Ethan McCallum

Chapter 9. New and Upcoming

A perfect world would let us stop time to research and write, since a technical book covers a moving target. We didn’t have such a luxury, so instead we set aside some space to pick up on some new arrivals.

This chapter mentions a few tools for which we could have provided more coverage, had we been willing to postpone the book’s release date. Think of this as a look into one possible future of R parallelism. Special thanks to our colleagues, reviewers, and friends who so kindly brought these to our attention.

doRedis

The foreach() function[62] evaluates an arbitrary R expression once for each element of its input. Its strength is that it can run those evaluations in parallel when given a parallel backend. The doRedis package provides such a backend, using the Redis datastore[63] as a job queue.

doRedis can work locally to take advantage of multicore systems, and it can also farm tasks out to remote R instances (“workers”). It’s straightforward to add or remove workers at runtime, even in mid-job, to adapt to changing work conditions or speed up job processing. Like Hadoop, doRedis is fault-tolerant: failed tasks are automatically resubmitted to their job queue.

doRedis supports Linux, Mac OS X, and Windows systems.
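
As a rough illustration of the workflow described above, the following sketch registers doRedis as the foreach() backend, starts two local workers, and runs a trivial loop. It assumes a Redis server is already listening on localhost at the default port 6379; the queue name "jobs" and the squaring task are placeholders chosen for this example, not anything prescribed by the package.

```r
library(foreach)
library(doRedis)

# Assumption: a Redis server is reachable at localhost:6379.
queue <- "jobs"  # hypothetical queue name, chosen for this example

# Tell foreach() to send its tasks to the Redis-backed queue.
registerDoRedis(queue, host = "localhost", port = 6379)

# Start two background R worker processes on this machine
# that pull tasks from the same queue.
startLocalWorkers(n = 2, queue = queue)

# Each iteration becomes a task on the queue; .combine = c
# gathers the per-task results into a single vector.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}
print(squares)

# Delete the queue so the workers shut down once they notice it is gone.
removeQueue(queue)
```

To add capacity from another machine, an R session there could call redisWorker() with the same queue name and the Redis server's host name. Because workers only need to reach the Redis server, they can join or leave while the loop is still running.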

RevoScale R and RevoConnectR (RHadoop)

Revolution Analytics is a company that provides R tools, support, and training. They have two products of note.

First up is the ...