Pre-processing data using tokenization

Pre-processing the data involves converting raw text into a form that the learning algorithm can accept.

Tokenization is the process of dividing text into a set of meaningful pieces. These pieces are called tokens.
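As a minimal sketch, the following uses the NLTK library (an assumption on our part; the text does not name a specific library, and any tokenizer would illustrate the same idea) to split a sample string into sentence tokens and word tokens:

```python
# A minimal sketch of tokenization, assuming NLTK is installed
# (pip install nltk); the library choice is illustrative only.
from nltk.tokenize import sent_tokenize, word_tokenize

# The 'punkt' tokenizer models must be downloaded once beforehand:
# import nltk; nltk.download('punkt')

text = "Tokenization divides text into meaningful pieces. Each piece is a token."

# Split the text into sentence tokens
print(sent_tokenize(text))
# ['Tokenization divides text into meaningful pieces.', 'Each piece is a token.']

# Split the text into word tokens (punctuation becomes its own token)
print(word_tokenize(text))
# ['Tokenization', 'divides', 'text', 'into', 'meaningful', 'pieces', '.', ...]
```

Sentence tokens are useful when the algorithm operates on whole statements, while word tokens are the usual input for tasks such as counting word frequencies or building a vocabulary.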
