Natural Language Processing

Machine learning, and artificial intelligence more broadly, is built on data, which can be structured or unstructured. Natural language processing (NLP) is a field of algorithms focused on processing unstructured data in the form of human language. This chapter covers unstructured data in natural language text format. Organizations typically hold large corpora of unstructured text data in the form of Word documents, PDFs, email bodies, or web documents. With advances in technology, organizations have started relying on large volumes of text information. For example, a legal firm holds a great deal of information in the form of bond papers, legal agreements, court orders, law documents, and so on. Such information assets are made up of textual ...

