Porter stemming

Porter stemming is a stemming algorithm that strips suffixes from English words to reduce them to a base form, or stem. The purpose of the Porter stemmer is to make NLP model training more efficient: because several inflected forms of a word collapse into one stem, the number of distinct terms drops, and with it the memory footprint and complexity of your term space. Porter is not dictionary-based. It does not consult a dictionary of stems to decide which suffixes to remove; instead, it applies a fixed set of generic rules. Some people see this as a drawback, because the rules are simple and do not account for the finer contextual details of English ...
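
To make the idea of rule-based suffix stripping concrete, here is a minimal sketch in Python. It implements only step 1a of Porter's algorithm (the plural-handling rules), so it is a toy illustration and not the full stemmer; the names `STEP_1A_RULES`, `step_1a`, and `vocabulary` are made up for this example.

```python
# Toy illustration of rule-based suffix stripping.
# Assumption: only step 1a of Porter's algorithm is implemented here;
# the full algorithm applies several further steps after this one.

# (suffix, replacement) pairs, checked in order; the first match wins.
STEP_1A_RULES = [
    ("sses", "ss"),  # caresses -> caress
    ("ies", "i"),    # ponies   -> poni
    ("ss", "ss"),    # caress   -> caress (left unchanged)
    ("s", ""),       # cats     -> cat
]


def step_1a(word: str) -> str:
    """Apply the first matching step 1a rule to a lowercase word."""
    for suffix, replacement in STEP_1A_RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word  # no rule matched, so the word is returned as-is


def vocabulary(tokens):
    """Return the set of distinct stems for an iterable of tokens."""
    return {step_1a(token.lower()) for token in tokens}


tokens = ["cat", "cats", "pony", "ponies", "caress", "caresses"]
print(sorted(vocabulary(tokens)))  # ['caress', 'cat', 'poni', 'pony']
```

Six tokens shrink to four stems without any dictionary lookup, which is the reduction in term space described above. The output also shows the cost of generic rules: a stem such as "poni" is not an English word, and with this single step "pony" and "ponies" still end up as different terms.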
