Data Mining Models by David L. Olson

CHAPTER 9

Scalability

Previous chapters focused on explaining models, using a small dataset (loan application approval) for demonstration. Data mining, and especially big data, implies much larger datasets. For all of the typical datasets we have presented, key data will increasingly be collected in real time. The major operational difference is scalability. R can work with massive datasets and scales well, whereas KNIME and WEKA are potentially limited in their ability to handle large datasets.

In this chapter, we will demonstrate some data characteristics with R. The data mining process presented in Chapter 3 needs to consider the outcome (some data is predictive, like the proportion of income expended on groceries ...
