- Start a new project in IntelliJ or in an IDE of your choice. Make sure that the necessary JAR files are included.
- Set up the package location where the program will reside:
package spark.ml.cookbook.chapter13
- Import the necessary packages:
import org.apache.log4j.{Level, Logger}
import org.apache.spark.mllib.classification.StreamingLogisticRegressionWithSGD
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable.Queue
- Create a SparkSession object as the entry point to the cluster, then create a StreamingContext:
val spark ...