Anaconda for cluster management

Anaconda for cluster management provides resource management tools which allow users to easily create, provision and manage bare-metal or cloud-based clusters. It enables the management of conda environments on clusters and provides integration, configuration and setup management of Hadoop services. This can be installed alongside enterprise Hadoop distributions such as Cloudera CDH or Hortonworks HDP and this is used to manage conda packages and environments across a cluster.

Getting ready

To step through this recipe, you need Ubuntu 14.04 (Linux flavor) installed on the machine. Python comes pre-installed. python --version gives the version of Python installed. If the version is 2.6.x, upgrade it to Python 2.7 as ...

Get Apache Spark for Data Science Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.