Pruning redundant rules

Generated rules often include repeated or redundant rules (for instance, one rule is a super rule of another, meaning that it contains all of the other rule's items plus at least one more). In this recipe, we will show how to prune (or remove) such rules.

Getting ready

In this recipe, you need to have completed the previous recipe by generating rules and storing them in a variable named rules.

How to do it…

Perform the following steps to prune redundant rules:

  1. First, you need to identify the redundant rules:
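    > # sort by lift so that the strongest rules come first
    > rules.sorted = sort(rules, by="lift")
    > # entry [i, j] is TRUE when rule i is a subset of rule j
    > subset.matrix = is.subset(rules.sorted, rules.sorted)
    > # keep only the upper triangle, so each rule is compared with the rules ranked above it
    > subset.matrix[lower.tri(subset.matrix, diag=TRUE)] = NA
    > # a rule is redundant if at least one higher-ranked rule is a subset of it
    > redundant = colSums(subset.matrix, na.rm=TRUE) >= 1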
    
  2. You can then remove the redundant rules:
    > rules.pruned ...
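
To see what step 1 computes, the following base R sketch may help. It does not use arules: the list rule.items and the rule names r1 to r4 are hypothetical, the rules are assumed to be already sorted by decreasing lift, and the subset matrix is built by hand as a stand-in for the output of is.subset. It is only meant to show how masking the lower triangle and summing the columns flags a rule.

```r
# Hypothetical rules, written as the set of items each one involves.
# Assumption: they are already sorted by decreasing lift (r1 is strongest).
rule.items = list(
  r1 = c("a", "b"),
  r2 = c("a", "b", "c"),  # super rule of r1
  r3 = c("d"),
  r4 = c("a", "b")        # repeats r1
)

# Hand-built stand-in for is.subset(): entry [i, j] is TRUE when
# all items of rule i are contained in rule j.
n = length(rule.items)
subset.matrix = matrix(FALSE, n, n,
                       dimnames = list(names(rule.items), names(rule.items)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    subset.matrix[i, j] = all(rule.items[[i]] %in% rule.items[[j]])
  }
}

# Mask the diagonal and lower triangle, so column j only records
# whether a higher-ranked rule (row i < j) is a subset of rule j.
subset.matrix[lower.tri(subset.matrix, diag = TRUE)] = NA

# A column with at least one TRUE marks a redundant rule.
redundant = colSums(subset.matrix, na.rm = TRUE) >= 1

flagged = names(redundant)[redundant]
kept = names(redundant)[!redundant]
print(flagged)
print(kept)
```

Here r2 and r4 are flagged because the higher-ranked r1 is a subset of both, while r1 and r3 are kept. Note that, depending on the installed arules version, is.subset may return a sparse matrix by default; in that case, requesting a dense result (for example, with its sparse=FALSE argument) may be needed before the NA assignment works.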
