Apache Spark for Data Science Cookbook by Padma Priya Chitturi

Working with Spark's Python and Scala shells

This recipe introduces spark-shell and PySpark, the command-line interface tools that ship with Apache Spark. Spark-shell is the Scala-based shell and PySpark is the Python-based shell; both are used to develop Spark applications interactively, and both start with a SparkContext, SQLContext, and HiveContext already initialized.

How to do it…

Both spark-shell and PySpark are available in the bin directory of SPARK_HOME, that is, SPARK_HOME/bin:

  1. Invoke spark-shell as follows:

     $SPARK_HOME/bin/spark-shell [options]
     $SPARK_HOME/bin/spark-shell --master <master type>

     where <master type> is one of local, spark, yarn, or mesos. For example, to connect to a standalone Spark master:

     $SPARK_HOME/bin/spark-shell --master spark://<sparkmasterHostName>:7077
     Welcome to
           ____              __
          / __/__  ...
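PySpark is invoked the same way from SPARK_HOME/bin. The following is a minimal sketch, assuming the --master values mirror those shown for spark-shell; the sample session relies on the shell's pre-initialized SparkContext, which is bound to the name sc:

```shell
# Launch the Python shell against a local master with two worker threads
# (the --master flag takes the same values as shown for spark-shell above)
$SPARK_HOME/bin/pyspark --master local[2]

# Inside the shell, `sc` is already a live SparkContext, so a quick
# sanity check needs no setup:
#   >>> sc.parallelize(range(10)).sum()
#   45
```

Quitting the shell (Ctrl+D, or exit() in PySpark) stops the associated SparkContext cleanly.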
