Book description
Build powerful smart applications using deep learning algorithms, and master numerical computing, deep learning, and functional programming.
About This Book
- Explore machine learning techniques with prominent open-source Scala libraries and tools such as Spark ML, H2O, MXNet, Zeppelin, and DeepLearning4j
- Solve real-world machine learning problems by delving into complex numerical computing with Scala functional programming in a scalable and faster way
- Cover all key aspects, such as collecting, storing, processing, analyzing, and evaluating data, required to build and deploy machine learning models on computing clusters using the Scala Play framework
Who This Book Is For
If you want to leverage the power of both Scala and Spark to make sense of Big Data, then this book is for you. If you are well versed in machine learning concepts and want to expand your knowledge by delving into practical implementation using the power of Scala, then this book is what you need! A strong understanding of the Scala programming language is recommended. Basic familiarity with machine learning techniques will also be helpful.
What You Will Learn
- Apply advanced regression techniques to boost the performance of predictive models
- Use different classification algorithms for business analytics
- Generate trading strategies for Bitcoin and stock trading using ensemble techniques
- Train Deep Neural Networks (DNN) using H2O and Spark ML
- Utilize NLP to build scalable machine learning models
- Learn how to apply reinforcement learning algorithms such as Q-learning to develop ML applications
- Learn how to use autoencoders to develop a fraud detection application
- Implement LSTM and CNN models using DeepLearning4j and MXNet
In Detail
Machine learning has had a huge impact on academia and industry by turning data into actionable information. Scala has seen a steady rise in adoption over the past few years, especially in the fields of data science and analytics. This book is for data scientists, data engineers, and deep learning enthusiasts who have a background in complex numerical computing and want to learn more about hands-on machine learning application development.
If you're well versed in machine learning concepts and want to expand your knowledge by delving into the practical implementation of these concepts using the power of Scala, then this book is what you need! Through 11 end-to-end projects, you will become acquainted with popular machine learning libraries such as Spark ML, H2O, DeepLearning4j, and MXNet.
By the end of the book, you will be able to use numerical computing and functional programming to carry out complex numerical tasks and to develop, build, and deploy research or commercial projects in a production-ready environment.
Style and approach
This book leverages the power of machine learning and deep learning in different domains, offers best practices and tips from real-world case studies, and helps you avoid pitfalls and fallacies in decision making based on predictive analytics with ML models.
Table of contents
- Preface
- Analyzing Insurance Severity Claims
- Machine learning and learning workflow
- Hyperparameter tuning and cross-validation
- Analyzing and predicting insurance severity claims
- LR for predicting insurance severity claims
- GBT regressor for predicting insurance severity claims
- Boosting the performance using random forest regressor
- Comparative analysis and model deployment
- Summary
- Analyzing and Predicting Telecommunication Churn
- High Frequency Bitcoin Price Prediction from Historical and Live Data
- Population-Scale Clustering and Ethnicity Prediction
- Topic Modeling - A Better Insight into Large-Scale Texts
- Topic modeling and text clustering
- Topic modeling with Spark MLlib and Stanford NLP
- Implementation
- Step 1 - Creating a Spark session
- Step 2 - Creating vocabulary and tokens count to train the LDA after text pre-processing
- Step 3 - Instantiate the LDA model before training
- Step 4 - Set the NLP optimizer
- Step 5 - Training the LDA model
- Step 6 - Prepare the topics of interest
- Step 7 - Topic modelling
- Step 8 - Measuring the likelihood of two documents
- Other topic models versus the scalability of LDA
- Deploying the trained LDA model
- Summary
- Developing Model-based Movie Recommendation Engines
- Recommendation system
- Spark-based movie recommendation systems
- Item-based collaborative filtering for movie similarity
- Model-based recommendation with Spark
- Data exploration
- Movie recommendation using ALS
- Step 1 - Import packages, load, parse, and explore the movie and rating dataset
- Step 2 - Register both DataFrames as temp tables to make querying easier
- Step 3 - Explore and query for related statistics
- Step 4 - Prepare training and test rating data and check the counts
- Step 5 - Prepare the data for building the recommendation model using ALS
- Step 6 - Build an ALS user product matrix
- Step 7 - Making predictions
- Step 8 - Evaluating the model
- Selecting and deploying the best model
- Summary
- Options Trading Using Q-learning and Scala Play Framework
- Clients Subscription Assessment for Bank Telemarketing using Deep Neural Networks
- Client subscription assessment through telemarketing
- Dataset description
- Installing and getting started with Apache Zeppelin
- Exploratory analysis of the dataset
- Label distribution
- Job distribution
- Marital distribution
- Education distribution
- Default distribution
- Housing distribution
- Loan distribution
- Contact distribution
- Month distribution
- Day distribution
- Previous outcome distribution
- Age feature
- Duration distribution
- Campaign distribution
- Pdays distribution
- Previous distribution
- emp_var_rate distributions
- cons_price_idx features
- cons_conf_idx distribution
- Euribor3m distribution
- nr_employed distribution
- Statistics of numeric features
- Implementing a client subscription assessment model
- Hyperparameter tuning and feature selection
- Summary
- Fraud Analytics Using Autoencoders and Anomaly Detection
- Outlier and anomaly detection
- Autoencoders and unsupervised learning
- Developing a fraud analytics model
- Description of the dataset and using linear models
- Problem description
- Preparing programming environment
- Step 1 - Loading required packages and libraries
- Step 2 - Creating a Spark session and importing implicits
- Step 3 - Loading and parsing input data
- Step 4 - Exploratory analysis of the input data
- Step 5 - Preparing the H2O DataFrame
- Step 6 - Unsupervised pre-training using autoencoder
- Step 7 - Dimensionality reduction with hidden layers
- Step 8 - Anomaly detection
- Step 9 - Pre-trained supervised model
- Step 10 - Model evaluation on the highly-imbalanced data
- Step 11 - Stopping the Spark session and H2O context
- Auxiliary classes and methods
- Hyperparameter tuning and feature selection
- Summary
- Human Activity Recognition using Recurrent Neural Networks
- Working with RNNs
- Human activity recognition using the LSTM model
- Implementing an LSTM model for HAR
- Step 1 - Importing necessary libraries and packages
- Step 2 - Creating MXNet context
- Step 3 - Loading and parsing the training and test set
- Step 4 - Exploratory analysis of the dataset
- Step 5 - Defining internal RNN structure and LSTM hyperparameters
- Step 6 - LSTM network construction
- Step 7 - Setting up an optimizer
- Step 8 - Training the LSTM network
- Step 9 - Evaluating the model
- Tuning LSTM hyperparameters and GRU
- Summary
- Image Classification using Convolutional Neural Networks
- Other Books You May Enjoy
Product information
- Title: Scala Machine Learning Projects
- Author(s):
- Release date: January 2018
- Publisher(s): Packt Publishing
- ISBN: 9781788479042