Tips to avoid common regression problems

First, use prior studies and domain knowledge to decide which features to include in the regression. Check the literature, reports, and previous studies to see which features have worked and which variables are reasonable for modeling your problem. Do not select features on correlation alone: given a large set of features filled with random data, it is highly likely that several of them will appear correlated with the target variable, even though the data is random.
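The effect of random features appearing correlated with the target can be illustrated with a small simulation. The following is a minimal sketch, not code from the book: the class name SpuriousCorrelationDemo, the helper methods, the sample sizes, and the 0.5 threshold are all assumptions chosen for illustration. It draws a random target and many random features, then counts how many features have an absolute Pearson correlation with the target above the threshold.

```java
import java.util.Random;

// Hypothetical demo class; names and parameters are illustrative assumptions.
class SpuriousCorrelationDemo {

    // Pearson correlation coefficient between two equally sized arrays.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0, varX = 0.0, varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    // Fills an array with independent standard normal values.
    static double[] randomColumn(Random rnd, int n) {
        double[] col = new double[n];
        for (int i = 0; i < n; i++) {
            col[i] = rnd.nextGaussian();
        }
        return col;
    }

    // Counts random features whose |correlation| with a random target
    // exceeds the threshold. Features and target are generated independently.
    static int countSpurious(int nSamples, int nFeatures, double threshold, long seed) {
        Random rnd = new Random(seed);
        double[] target = randomColumn(rnd, nSamples);
        int count = 0;
        for (int f = 0; f < nFeatures; f++) {
            double[] feature = randomColumn(rnd, nSamples);
            if (Math.abs(pearson(feature, target)) > threshold) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // 20 samples, 1000 random features: a small sample makes chance
        // correlations more likely to look strong.
        int hits = countSpurious(20, 1000, 0.5, 42L);
        System.out.println("Random features with |r| > 0.5: " + hits);
    }
}
```

With few samples and many candidate features, some features are expected to cross the threshold purely by chance, which is why domain knowledge should guide feature selection rather than correlation with the target alone.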

Second, keep the model simple in order to avoid overfitting. Occam's razor states that you should select the model that best explains your data with the fewest assumptions. In practice, the model can be as simple as two to four predictor features.
