- Create a new file and import the required packages:
import nltk.classify.util
from nltk.classify import NaiveBayesClassifier
from nltk.corpus import movie_reviews
- Define a function to extract features:
def collect_features(word_list):
    # Map every word to True (presence-only bag-of-words features)
    return dict([(word, True) for word in word_list])
- Use the movie reviews corpus in NLTK as training data:
if __name__ == '__main__':
    # File IDs of the reviews tagged 'pos' and 'neg' in the corpus
    plus_filenum = movie_reviews.fileids('pos')
    minus_filenum = movie_reviews.fileids('neg')
- Divide the data into positive and negative reviews:
    # Pair each review's feature dictionary with its label
    feature_pluspts = [(collect_features(movie_reviews.words(fileids=[f])), 'Positive')
                       for f in plus_filenum]
    feature_minuspts = [(collect_features(movie_reviews.words(fileids=[f])), 'Negative')
                        for f in minus_filenum]
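To see the shape of the data these steps produce without downloading the corpus, the following is a minimal sketch that runs the same feature extractor on a few made-up word lists. The name `toy_reviews` and its contents are hypothetical stand-ins for what `movie_reviews.words(fileids=[f])` would return; they are not part of NLTK.

```python
def collect_features(word_list):
    # Presence-only features: every word in the list maps to True
    return dict([(word, True) for word in word_list])

# Hypothetical toy data standing in for the tokenized corpus files
toy_reviews = {
    'Positive': [['a', 'great', 'film'], ['great', 'acting']],
    'Negative': [['a', 'dull', 'plot']],
}

# Build (feature_dict, label) pairs, the same structure as
# feature_pluspts and feature_minuspts in the steps above
labeled_features = [
    (collect_features(words), label)
    for label, reviews in toy_reviews.items()
    for words in reviews
]

for features, label in labeled_features:
    print(label, sorted(features))
```

Because the features are stored in a dictionary, a word that occurs several times in a review appears only once; the extractor records presence, not frequency.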