Getting ready

Preparing for this recipe can be quite tricky. First, we need to start a Spark server. At the time of writing, the conda packages for accessing Spark were quite immature, so although we will still use conda here, we will not install any Spark packages from it. Follow these steps to prepare the environment:

  1. Make sure that you have Java 8 installed. Be careful with the Java version: an older version will not work, and a newer one might also be problematic.
  2. Download Spark (https://spark.apache.org/downloads.html). This code was tested with version 2.3.2. Do not use an older version, although you might want to try a newer one.
  3. Unpack the downloaded archive and enter the resulting directory. Run ./sbin/start-all.sh.
  4. With your browser, connect to http://localhost:8080 ...
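
Before moving on, it can help to confirm the setup from Python. The following is a minimal sketch, not part of the recipe itself: the helper names (parse_java_major, installed_java_major, and spark_ui_reachable) are hypothetical, and it assumes that java is on your PATH and that http://localhost:8080 is serving the Spark web interface after start-all.sh has run.

```python
import re
import subprocess
import urllib.request
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the Java major version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report "1.8.0_181"; later releases report "11.0.2".
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major() -> Optional[int]:
    """Return the major version of the `java` found on PATH, or None."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None
    # `java -version` writes its report to stderr, not stdout.
    return parse_java_major(result.stderr + result.stdout)


def spark_ui_reachable(url: str = "http://localhost:8080",
                       timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 (assumed Spark web UI)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP errors.
        return False


if __name__ == "__main__":
    print("Java major version:", installed_java_major())
    print("Spark web UI reachable:", spark_ui_reachable())
```

If the first line does not report 8, or the second reports False, revisit steps 1 to 4 before continuing.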
