5

Classification Models in VisMiner

Classification is a form of prediction modeling that uses selected input attribute values to predict a nominal or categorical output value. A classification model is constructed from a dataset of historical data from past events in which the values of both the input and output attributes are known. The classification methodology uses those values to construct a model that best fits the data; that is, the model accurately predicts the output category based on the input values. The process of model construction is sometimes referred to as training. Once constructed and validated, the model can be used to predict the category in cases where the input attribute values are known but the value of the output attribute is not yet known. For example, an insurance company may want to build a classification model to predict whether an insurance claim is likely to be fraudulent or legitimate.
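The train-then-predict workflow described above can be illustrated with a short Python sketch. This is not how VisMiner builds its models; it uses a deliberately simple nearest-centroid classifier, and the claim data, attribute choices, and function names (`train`, `predict`) are hypothetical, chosen only to mirror the insurance example.

```python
from collections import defaultdict


def train(rows, labels):
    """Compute the mean input vector (centroid) for each output category."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        for i, value in enumerate(row):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(model, row):
    """Return the category whose centroid is closest to the input row."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical historical claims with known outcomes:
# (claim amount in $1000s, days since the policy started)
history = [(2.0, 400), (1.5, 350), (3.0, 500),
           (9.0, 20), (8.5, 15), (10.0, 30)]
outcomes = ["legitimate", "legitimate", "legitimate",
            "fraudulent", "fraudulent", "fraudulent"]

# Training: fit the model to data where inputs and outputs are both known.
model = train(history, outcomes)

# Prediction: inputs are known, the output category is not.
new_claim = (9.5, 25)
prediction = predict(model, new_claim)
print(prediction)
```

A real model would also be validated on data held back from training before being trusted with new claims.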

This chapter introduces the functionality of three modelers or methodologies for classification as they are implemented in VisMiner: decision trees, artificial neural networks, and support vector machines.

Dataset Preparation

The dataset used for classification modeling in VisMiner must be in a tabular format. The input attributes may be of any data type – numeric, ordinal, or nominal. The output attribute must be nominal or discrete (integer) numeric.
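As a rough illustration of these requirements, the following Python sketch checks a dataset before modeling. It is an assumption-laden stand-in, not part of VisMiner: the dataset is represented as a list of dictionaries, and the helper names (`is_tabular`, `output_is_classifiable`) and the sample rows are hypothetical.

```python
def is_tabular(rows):
    """True if every row has exactly the same set of attribute names."""
    if not rows:
        return False
    columns = set(rows[0])
    return all(set(row) == columns for row in rows)


def output_is_classifiable(rows, output):
    """True if every output value is nominal (str) or a discrete integer."""
    return all(isinstance(row[output], (str, int)) for row in rows)


# Hypothetical claims table: numeric and nominal inputs, nominal output.
claims = [
    {"amount": 2000.0, "region": "west", "status": "legitimate"},
    {"amount": 9500.5, "region": "east", "status": "fraudulent"},
    {"amount": 1200.0, "region": "west", "status": "legitimate"},
]

print(is_tabular(claims))                        # same columns in every row
print(output_is_classifiable(claims, "status"))  # nominal output is acceptable
print(output_is_classifiable(claims, "amount"))  # continuous output is not
```

Input attributes are left unchecked here because any data type is acceptable for them; only the output attribute is constrained.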

It is important to remember that when using VisMiner to build a classification model, the dataset should ...
