Summary

In this chapter, we learned how to generate frequent itemsets from a dataset using the Apriori algorithm. We then proposed association rules from these itemsets by describing their support and confidence. We used one additional check, an added value measure, to ensure that the proposed rules were interesting. We implemented all these concepts using a freely available dataset of Freecode open source projects and their tags. We calculated support for single tags, then generated doubletons and tripletons that met a minimum support threshold. For rules with one item on the right-hand side, we calculated confidence and added value for each. Finally, we looked closely at the rules that were generated and tried to figure out which ones were interesting, ...
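The steps recapped above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's own code: the `baskets` data, the `MIN_SUPPORT` value, and the helper names `support`, `frequent_itemsets`, and `rules` are all hypothetical, and added value is assumed here to mean the rule's confidence minus the support of its right-hand side.

```python
from itertools import combinations

# Hypothetical toy data: each set holds the tags of one project.
baskets = [
    {"python", "web", "framework"},
    {"python", "web"},
    {"python", "framework"},
    {"web", "framework", "python"},
    {"java", "web"},
    {"python"},
]

MIN_SUPPORT = 0.3  # assumed threshold, chosen only for this example


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def frequent_itemsets(baskets, min_support, max_size=3):
    """Apriori-style search for singletons, doubletons, and tripletons."""
    items = sorted(set().union(*baskets))
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    for size in range(1, max_size + 1):
        # Keep only the candidates that meet the minimum support.
        kept = {}
        for candidate in candidates:
            s = support(candidate, baskets)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        # Build the next level from surviving items, pruning any candidate
        # that has an infrequent subset one size smaller.
        survivors = sorted({item for itemset in kept for item in itemset})
        candidates = [
            frozenset(combo)
            for combo in combinations(survivors, size + 1)
            if all(frozenset(sub) in kept for sub in combinations(combo, size))
        ]
    return frequent


def rules(frequent):
    """Rules with a single right-hand-side item, plus their measures."""
    found = []
    for itemset, supp in frequent.items():
        if len(itemset) < 2:
            continue
        for rhs in itemset:
            lhs = itemset - {rhs}
            confidence = supp / frequent[lhs]
            # Assumption: added value = confidence - support(rhs).
            added_value = confidence - frequent[frozenset([rhs])]
            found.append((tuple(sorted(lhs)), rhs, supp, confidence, added_value))
    return found


frequent = frequent_itemsets(baskets, MIN_SUPPORT)
found = rules(frequent)
for lhs, rhs, supp, conf, av in sorted(found):
    print(f"{lhs} -> {rhs}: support={supp:.2f} confidence={conf:.2f} added_value={av:.2f}")
```

A positive added value suggests the left-hand side makes the right-hand side more likely than its baseline frequency, while a value near zero or below suggests the rule tells us little, however high its confidence.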

Get Mastering Data Mining with Python – Find patterns hidden in your data now with the O’Reilly learning platform.
