9.3 ADVANCED DATA MINING

9.3.1 Overview

Some common applications of exploratory data analysis and data mining require special treatment. All of them can use the techniques described in the book; however, a number of factors should be considered, and the data may need to be pre-analyzed before it is used within the framework described in the book. The further reading section of this chapter contains links to additional resources on these subjects.

Table 9.11. Optimization of the neural network model

Figure 9.12. Summary of contents of false positives

9.3.2 Text Data Mining

A common application is mining the information contained in books, journals, web content, intranet content, and content on your desktop. One of the first barriers to applying the data analysis and data mining techniques described in this book is that documents are textual rather than tabular. However, if the information can be translated into a tabular form, then the methods described can be applied to text documents. For example, a series of documents could be transformed into a data table as shown in Table 9.12. In this situation, each row represents a different document, and the columns represent all words contained in all documents. For each document, the presence of a word is indicated by “1” ...
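The transformation of documents into a table of word indicators can be sketched in Python. This is a minimal illustration under stated assumptions, not the book's own procedure: the function name build_presence_table, the two sample documents, the lowercase letters-only tokenization, and the use of "0" for a word that is absent are all choices made for this example.

```python
import re


def build_presence_table(documents):
    """Return (vocabulary, rows): one row per document, one column per word."""
    # Tokenization is an assumption: lowercase the text and keep runs of letters.
    tokenized = [set(re.findall(r"[a-z]+", text.lower())) for text in documents]

    # The columns are all words found in any document, sorted for a stable order.
    vocabulary = sorted(set().union(*tokenized))

    # 1 marks a word that is present; 0 for an absent word is an assumption.
    rows = [
        [1 if word in words else 0 for word in vocabulary]
        for words in tokenized
    ]
    return vocabulary, rows


# Hypothetical sample documents, invented for illustration only.
documents = [
    "Data mining finds patterns",
    "Text mining uses text data",
]

vocabulary, rows = build_presence_table(documents)
print(vocabulary)
for row in rows:
    print(row)
```

Each printed row corresponds to one document, and each position in the row corresponds to one word in the vocabulary. Once the documents are in this tabular form, they can be passed to methods that expect a data table.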
