IN THIS CHAPTER
Doing basic R programming
Setting up your programming environment
Developing a regression model
Developing classification tree models
A book covering the major facets of predictive analysis isn't complete unless it covers the R programming language. Our goal is to get you up and running as quickly as possible. That goal entails getting you started making predictions and experimenting with predictive analysis, using standard tools such as R and the algorithms data scientists and statisticians use to make predictive models.
So do you have to know how to program to create predictive models? We would answer, “probably not, but it surely helps.” Relax. We think you'll have fun learning R. Granted, this chapter is pretty high-level, so you may read it just to boost your understanding of how data scientists and statisticians use R.
In an enterprise environment, you'll most likely use commercial tools available from industry vendors. Getting familiar with a free, open-source, widely used, powerful tool like R prepares you to use the commercial tools with ease. By that point, you'll have absorbed a big dose of the terminology, learned how to handle the data, and worked through all the steps of predictive modeling. After doing all those steps “by hand,” you'll be well prepared to use the commercial tools.
Open-source R has memory and computational limits for enterprise-level big data analysis. Even so, open-source R is more than ...