Naïve Bayes classifier

The naïve Bayes algorithm uses probabilistic learning to predict classes. It is a generative model: it learns the joint probability P(X, Y) and then derives the conditional probability P(Y|X) using Bayes' theorem. The algorithm is called naïve because the assumptions it makes about the data sound very naïve: it assumes that the features, or predictor variables, are all equally important and independent of each other given the class. This assumption is rarely true for real-life data. For example, text classification is an area in which naïve Bayes shines, even though some words are more important than others in predicting the class, and some words are more likely to occur together. In e-mail classification, ...
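To make the independence assumption concrete, here is a minimal sketch of a word-count naïve Bayes text classifier in plain Python. It is not Mahout's implementation. The function names `train` and `predict`, the `TOY_DOCS` data, and the use of add-one (Laplace) smoothing are all assumptions chosen for illustration. The score for each class is the log of the class prior P(Y) plus the sum of the log likelihoods of each word given that class, which is exactly where the features are treated as independent.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count classes and per-class words from (words, label) pairs."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()                       # every word seen in training
    for words, label in docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def predict(model, words):
    """Return the class with the highest log posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        # Start from the log prior, log P(Y).
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            if w not in vocab:
                continue  # ignore words never seen in training
            # Naive step: add each word's log likelihood independently,
            # with add-one smoothing so unseen pairs are not zero.
            count = word_counts[label][w]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, invented purely for this illustration.
TOY_DOCS = [
    ("win money now".split(), "spam"),
    ("free money offer".split(), "spam"),
    ("meeting agenda attached".split(), "ham"),
    ("project meeting tomorrow".split(), "ham"),
]

model = train(TOY_DOCS)
```

With this toy data, `predict(model, ["free", "money"])` returns `"spam"` and `predict(model, ["meeting", "tomorrow"])` returns `"ham"`. Working in log space avoids numerical underflow when many small probabilities are multiplied together.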
