Because the negative examples, that is, the instances of fraud, are far outnumbered by the positive examples, learning algorithms struggle with induction. We can help them by providing a dataset in which the shares of positive and negative examples are comparable. This can be achieved by rebalancing the dataset.
Weka has a built-in filter, Resample, which produces a random subsample of a dataset, sampling either with or without replacement. The supervised version of the filter can also bias the sample toward a uniform class distribution.
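To make the idea of biasing a sample toward a uniform class distribution concrete, the following is a minimal plain-Java sketch. It is not Weka's Resample implementation and does not use the Weka API. The class name UniformResampleSketch, the method resampleUniform, and the representation of a dataset as an array of integer class labels are all assumptions made for illustration. The sketch samples with replacement and gives every class the same quota.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical helper, not part of Weka: illustrates resampling
// toward a uniform class distribution.
final class UniformResampleSketch {

    /**
     * Draws sampleSize row indices with replacement so that each class
     * receives an equal share of the sample (shares differ by at most one
     * row when sampleSize is not divisible by the number of classes).
     */
    static List<Integer> resampleUniform(int[] labels, int sampleSize, long seed) {
        List<Integer> sample = new ArrayList<>();
        if (labels.length == 0 || sampleSize <= 0) {
            return sample;
        }

        // Group row indices by class label; TreeMap keeps the order deterministic.
        Map<Integer, List<Integer>> byClass = new TreeMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], key -> new ArrayList<>()).add(i);
        }

        Random rnd = new Random(seed);
        int numClasses = byClass.size();
        int classPosition = 0;
        for (List<Integer> members : byClass.values()) {
            // Equal quota per class; the remainder goes to the first classes.
            int quota = sampleSize / numClasses
                    + (classPosition < sampleSize % numClasses ? 1 : 0);
            for (int j = 0; j < quota; j++) {
                // Sampling with replacement: the same row may be drawn repeatedly.
                sample.add(members.get(rnd.nextInt(members.size())));
            }
            classPosition++;
        }

        // Shuffle so that rows of the same class are not stored contiguously.
        Collections.shuffle(sample, rnd);
        return sample;
    }
}
```

For example, with 990 rows of one class and 10 of the other, a call such as resampleUniform(labels, 200, 42L) returns 200 indices, 100 pointing at each class. Rows of the rare class are necessarily repeated, which is the price of oversampling with replacement.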
We will proceed by manually implementing k-fold cross-validation. First, we will split the dataset into k equal folds. Fold k will be used for testing, while the other folds will be used ...
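The fold-splitting step can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the implementation used in this chapter: the class KFoldSketch and its methods are hypothetical, rows are represented by their indices, and the sketch assumes the usual convention that in each round one fold is held out for testing while all remaining folds are used for training.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: splits row indices into k folds for cross-validation.
final class KFoldSketch {

    /**
     * Shuffles the indices 0..n-1 and deals them into k folds
     * whose sizes differ by at most one.
     */
    static List<List<Integer>> split(int n, int k, long seed) {
        if (k < 2 || k > n) {
            throw new IllegalArgumentException("need 2 <= k <= n");
        }
        List<Integer> indices = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        // Shuffle first so that folds do not depend on the original row order.
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>(k);
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        // Deal the shuffled indices round-robin into the folds.
        for (int i = 0; i < n; i++) {
            folds.get(i % k).add(indices.get(i));
        }
        return folds;
    }

    /** Collects the indices of every fold except the held-out test fold. */
    static List<Integer> trainingIndices(List<List<Integer>> folds, int testFold) {
        List<Integer> train = new ArrayList<>();
        for (int f = 0; f < folds.size(); f++) {
            if (f != testFold) {
                train.addAll(folds.get(f));
            }
        }
        return train;
    }
}
```

A typical loop would call split once and then, for each fold index, use folds.get(f) as the test set and trainingIndices(folds, f) as the training set. Any rebalancing should be applied to the training indices only, so that the test fold keeps the original class distribution.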