Chapter 7. Support Vector Machines

In this chapter, we will set out to solve a common problem: determining whether customers are happy. We’ll approach this by noting that happy customers generally say nice things while unhappy ones don’t. This is their sentiment.

There are an infinite number of solutions to this problem, but this chapter will focus on just one that works well: support vector machines (SVMs). This algorithm draws decision boundaries to split data into classes, and it works well in higher dimensions because feature transformation lets it separate data without explicitly computing distances between every pair of points. We will discuss the usual testing methods we have laid out before (a brief code sketch of these follows the list):

  • Cross-validation

  • Confusion matrix

  • Precision and recall
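To make those tools concrete, here is a minimal sketch using scikit-learn on a synthetic stand-in dataset; the data, parameters, and variable names are illustrative assumptions, not the chapter’s own example:

    # A minimal sketch: train an SVM and evaluate it with cross-validation,
    # a confusion matrix, and precision/recall. The synthetic "customers"
    # below are an illustrative assumption, not the chapter's data.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.metrics import confusion_matrix, precision_score, recall_score
    from sklearn.svm import SVC

    # Stand-in data: 1,000 "customers" with 20 numeric features and a
    # happy/unhappy label.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    model = SVC(kernel='rbf', C=1.0)

    # Cross-validation: average accuracy across 5 folds of the training set.
    scores = cross_val_score(model, X_train, y_train, cv=5)
    print("Cross-validation accuracy: %.3f" % scores.mean())

    # Confusion matrix, precision, and recall on the held-out test set.
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    print(confusion_matrix(y_test, predictions))
    print("Precision: %.3f" % precision_score(y_test, predictions))
    print("Recall:    %.3f" % recall_score(y_test, predictions))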

But we will also delve into a new way of improving models, known as feature transformation (sketched in code after the list below). In addition, we will discuss the possibility of the following phenomena arising in a sentiment analysis problem:

  • Entanglement

  • Unstable data

  • Correction cascade

  • Configuration debt
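Before we get there, here is a minimal sketch of what feature transformation buys us, again using scikit-learn on synthetic data chosen purely for illustration: concentric-circle data that no straight line can separate becomes separable once a kernel implicitly maps it into a higher-dimensional space.

    # A minimal sketch of feature transformation: a linear decision boundary
    # fails on concentric-circle data, while an RBF kernel (an implicit map
    # to a higher-dimensional space) separates it. Dataset and parameters
    # are illustrative assumptions.
    from sklearn.datasets import make_circles
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)

    linear_svm = SVC(kernel='linear').fit(X_train, y_train)
    rbf_svm = SVC(kernel='rbf').fit(X_train, y_train)

    # The linear model hovers near chance; the RBF kernel separates the classes.
    print("Linear kernel accuracy: %.3f" % linear_svm.score(X_test, y_test))
    print("RBF kernel accuracy:    %.3f" % rbf_svm.score(X_test, y_test))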

Customer Happiness as a Function of What They Say

Our online store has two sets of customers, happy and unhappy. The happy customers return to the site consistently and buy from the company, while the unhappy customers are either window shoppers or spendthrifts who don’t care about the company and spend their money elsewhere. Our goals are to determine whether customer happiness correlates with our bottom line and, down the line, to monitor their ...
