Chapter 8. Hands-On Autoencoder

In this chapter, we will build applications using various versions of autoencoders, including undercomplete, overcomplete, sparse, denoising, and variational autoencoders.

To start, let’s return to the credit card fraud detection problem we introduced in Chapter 3. For this problem, we have 284,807 credit card transactions, of which only 492 (about 0.17%) are fraudulent. Using a supervised model, we achieved an average precision of 0.82, which is very impressive: we could catch well over 80% of the fraud with over 80% precision. Using an unsupervised model, we achieved an average precision of 0.69, which is very good considering we did not use labels: we could catch over 75% of the fraud with over 75% precision.
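As a reminder of what average precision measures, here is a minimal sketch in plain Python. The function name `average_precision` and the eight toy labels and scores are made up for illustration and are not taken from the credit card dataset. The sketch ranks transactions from most to least suspicious and averages the precision observed at each true fraud, assuming no two scores are tied.

```python
def average_precision(labels, scores):
    """Average precision for binary labels (1 = fraud) and anomaly scores.

    Illustrative helper, not a library function. Assumes no tied scores.
    """
    # Rank transactions from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0              # frauds found so far while walking down the ranking
    precision_sum = 0.0   # running sum of precision at each fraud's rank
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank

    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits


# Toy data: two frauds among eight transactions.
labels = [0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.10, 0.90, 0.80, 0.20, 0.70, 0.30, 0.05, 0.15]

ap = average_precision(labels, scores)
print(f"{ap:.4f}")  # 0.8333
```

The two frauds land at ranks 1 and 3, so the precisions are 1/1 and 2/3, and their mean is 5/6. With real data we would normally rely on `average_precision_score` from scikit-learn rather than a hand-written helper.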

Let’s see how this same problem can be solved using an autoencoder, which is also an unsupervised algorithm but one that uses a neural network.
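Before we turn to the real data, the following sketch illustrates the general idea under some loud assumptions: it uses plain NumPy rather than a neural network library, a purely linear undercomplete autoencoder (three inputs, one hidden unit), and synthetic data. All names here (`W_enc`, `W_dec`, `reconstruction_error`) are hypothetical. The autoencoder learns to reconstruct normal points, and a point it reconstructs poorly is flagged as anomalous.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data: 200 points lying near a line in 3-D space.
t = rng.normal(size=(200, 1))
direction = np.array([[1.0, 2.0, -1.0]])
X = t @ direction + 0.05 * rng.normal(size=(200, 3))

# Undercomplete linear autoencoder: 3 inputs -> 1 hidden unit -> 3 outputs.
W_enc = 0.1 * rng.normal(size=(3, 1))
W_dec = 0.1 * rng.normal(size=(1, 3))


def reconstruct(data):
    """Encode to the hidden unit, then decode back to 3 dimensions."""
    return (data @ W_enc) @ W_dec


def reconstruction_error(data):
    """Squared reconstruction error per row, used as the anomaly score."""
    return np.sum((data - reconstruct(data)) ** 2, axis=1)


initial_loss = float(np.mean(reconstruction_error(X)))

# Plain gradient descent on the mean squared reconstruction error.
learning_rate = 0.01
for _ in range(2000):
    hidden = X @ W_enc                      # shape (200, 1)
    error = hidden @ W_dec - X              # shape (200, 3)
    grad_dec = hidden.T @ error / len(X)
    grad_enc = X.T @ (error @ W_dec.T) / len(X)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_loss = float(np.mean(reconstruction_error(X)))

# A point far from the line should reconstruct much worse than normal points.
outlier = np.array([[2.0, -1.0, 2.0]])
normal_scores = reconstruction_error(X)
outlier_score = float(reconstruction_error(outlier)[0])
print(outlier_score > normal_scores.max())
```

The single hidden unit forces the model to keep only the dominant direction in the data, which is why the off-line point cannot be reconstructed well. This is a toy stand-in for the neural network autoencoders built in the rest of the chapter.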

Data Preparation

Let’s first load the necessary libraries:

'''Main'''
import numpy as np
import pandas as pd
import os, time, re
import pickle, gzip

'''Data Viz'''
import matplotlib.pyplot as plt
import seaborn as sns
color = sns.color_palette()
import matplotlib as mpl

%matplotlib inline

'''Data Prep and Model Evaluation'''
from sklearn import preprocessing as pp
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import log_loss
from sklearn.metrics import precision_recall_curve, average_precision_score
from sklearn.metrics import roc_curve ...
