Application

The application is the final component: it connects to the master nodes to start interacting with data on the newly built distributed Spark cluster. The application can be a submitted job or an interactive CLI shell in any supported language. You can start a PySpark shell and connect to an Apache Cassandra cluster by executing the following:

$SPARK_HOME/bin/pyspark \
  --packages com.datastax.spark:spark-cassandra-connector_2.11:2.3.0 \
  --master spark://127.0.0.1:7077 \
  --conf spark.driver.memory=<1/4 of memory in GB>g \
  --conf spark.executor.memory=<1/2 of memory in GB>g \
  --conf spark.driver.maxResultSize=<1/4 of memory in GB>g \
  --conf spark.cassandra.connection.host=<cassandra contact point> \
  --conf spark.cassandra.connection.local_dc=<dc ...
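The memory placeholders in the command are simple fractions of the machine's memory, so they can be filled in programmatically. The following is a minimal sketch, not part of the original text: the helper name build_pyspark_command, the 16 GB figure, the contact point 10.0.0.1, and the data center name dc1 are all hypothetical, and rounding down to whole gigabytes is an assumption. The original command is truncated after the local_dc option, so any options that follow it are not reproduced here.

```python
def build_pyspark_command(total_memory_gb, cassandra_host, local_dc,
                          master="spark://127.0.0.1:7077",
                          spark_home="$SPARK_HOME"):
    """Return the pyspark launch command as a list of arguments.

    Hypothetical helper: it only mirrors the options visible in the
    command above and sizes memory as 1/4 (driver, max result size)
    and 1/2 (executor) of the total memory.
    """
    # Assumption: round down to whole gigabytes, never below 1 GB.
    quarter_gb = max(1, total_memory_gb // 4)
    half_gb = max(1, total_memory_gb // 2)

    conf = {
        "spark.driver.memory": f"{quarter_gb}g",
        "spark.executor.memory": f"{half_gb}g",
        "spark.driver.maxResultSize": f"{quarter_gb}g",
        "spark.cassandra.connection.host": cassandra_host,
        "spark.cassandra.connection.local_dc": local_dc,
    }

    args = [
        f"{spark_home}/bin/pyspark",
        "--packages",
        "com.datastax.spark:spark-cassandra-connector_2.11:2.3.0",
        "--master",
        master,
    ]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Example values only; substitute your own memory size, host, and DC.
    command = build_pyspark_command(16, "10.0.0.1", "dc1")
    print(" \\\n  ".join(command))
```

Returning a list rather than a single string keeps each option separate, which makes it easy to inspect or append further --conf pairs before joining the arguments into a shell command.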

