Glossary

Accuracy. The proportion of predictions for which the model is correct.
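
As a rough illustration, accuracy can be computed as the fraction of predictions that match the known values. The function name `accuracy` and the sample lists below are hypothetical and chosen only for this sketch.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that equal the actual values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    # Count the positions where the prediction matches the actual value.
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical data: four of the five predictions are correct.
print(accuracy([1, 0, 1, 1, 0], [1, 0, 0, 1, 0]))  # 0.8
```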

Activation function. A function used within a neural network to transform a node's input level into an output signal.
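
A minimal sketch of the idea, assuming a sigmoid activation (one common choice; the glossary does not name a specific function). The helper `node_output` and its weighted-sum input are illustrative assumptions, not a complete network.

```python
import math


def sigmoid(x):
    """Sigmoid activation: maps any real input to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights):
    """Combine the inputs into a single input level, then apply the activation."""
    # Assumption: the input level is the weighted sum of the node's inputs.
    input_level = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(input_level)


# Hypothetical inputs and weights for a single node.
print(node_output([1.0, 2.0], [0.5, -0.25]))
```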

Aggregation. A process where the data is presented in a summary form, such as an average.
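
For example, individual observations might be summarized as one average per group. The sketch below assumes the data arrives as (group, value) pairs; the function name and the sales figures are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_by_group(rows):
    """Summarize (group, value) pairs as one average value per group."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical observations: three sales records across two regions.
sales = [("north", 10), ("south", 20), ("north", 30)]
print(average_by_group(sales))
```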

Alternative hypothesis. Within a hypothesis test, the alternative hypothesis (or research hypothesis) states specific values of the population that are possible when the null hypothesis is rejected.

Antecedent. An antecedent is the statement or statements in the IF-part of a rule.

Applying predictive models. Once a predictive model has been built, the model can be used or applied to a data set to predict a response variable.

Artificial neural network. See neural network.

Associative rules. Associative rules (or association rules) result from data mining and present information in the form “if X then Y”.

Average. See mean.

Average linkage. A measure of the distance between two clusters, calculated as the average of the distances between all pairs of observations, one taken from each cluster.
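
A small sketch of this calculation, assuming Euclidean distance between observations (other distance measures could be substituted). The function name and the sample points are hypothetical.

```python
from itertools import product
from math import dist


def average_linkage(cluster_a, cluster_b):
    """Average distance over every pair with one point from each cluster."""
    if not cluster_a or not cluster_b:
        raise ValueError("both clusters must contain at least one point")
    # Assumption: Euclidean distance, via math.dist (Python 3.8+).
    pairs = list(product(cluster_a, cluster_b))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


# Hypothetical clusters of two-dimensional points.
print(average_linkage([(0, 0)], [(3, 4), (6, 8)]))
```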

Backpropagation. A method for training a neural network by adjusting the weights using the errors between the network's current predictions and the actual values in the training set.

Bin. One of a series of ranges into which a variable is broken up, usually created at the data preparation step.

Binary variable. A variable with two possible outcomes: true (1) or false (0).

Binning. Process of breaking up a variable into a series of ranges.
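
As a sketch, a numeric variable can be binned by comparing each value against a sorted list of boundaries. The boundaries, labels, and the function name `assign_bin` are hypothetical; here each bin includes its lower boundary and excludes its upper one.

```python
from bisect import bisect_right


def assign_bin(value, edges):
    """Return the index of the bin that holds value.

    edges is a sorted list of boundaries; n boundaries define n + 1 bins.
    """
    return bisect_right(edges, value)


# Hypothetical age ranges: under 18, 18-39, 40-64, 65 and over.
edges = [18, 40, 65]
labels = ["under 18", "18-39", "40-64", "65 and over"]
ages = [12, 18, 39, 40, 70]
print([labels[assign_bin(age, edges)] for age in ages])
```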

Box plot. Also called a box-and-whisker plot, it is a way of graphically showing the median, quartiles and ...
