Classifier accuracy

Now we need to test our classifier with a bigger test set. In this case, we will randomly select 100 subjects: 50 spam and 50 not spam. Then we will count how many times the classifier chose the correct category:

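import csv

# classifier() and its arguments w, c, t, tw are assumed to be defined already
with open("test.csv") as f:
    correct = 0
    tests = csv.reader(f)
    for subject in tests:
        # subject[0] is the subject text, subject[1] is the expected category
        clase = classifier(subject[0], w, c, t, tw)
        if clase[1] == subject[1]:
            correct += 1
    print("Efficiency: {0} of 100".format(correct))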

In this case, the efficiency is 82 percent; that is, the classifier chose the correct category for 82 of the 100 subjects:

>>> Efficiency: 82 of 100
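
The counting step above can be tried without the trained model. The following is only a minimal sketch: toy_classifier, the trigger words, the label strings "spam" and "nospam", and the five sample rows are all hypothetical stand-ins, not the classifier or the test.csv file used in this chapter. It simply shows how the number of correct predictions turns into a percentage.

```python
import csv
import io


def toy_classifier(subject):
    """Hypothetical stand-in: label a subject as spam if it has a trigger word."""
    triggers = {"free", "win", "offer"}  # assumed trigger words, for illustration
    words = set(subject.lower().split())
    return "spam" if words & triggers else "nospam"


def accuracy(rows, classify):
    """Return (correct, total) for rows of (subject, expected_label)."""
    correct = 0
    total = 0
    for subject, expected in rows:
        total += 1
        if classify(subject) == expected:
            correct += 1
    return correct, total


# Hypothetical in-memory data with two columns: subject text, expected label
sample = io.StringIO(
    "Win a free cruise,spam\n"
    "Meeting agenda for Monday,nospam\n"
    "Special offer inside,spam\n"
    "Lunch tomorrow?,nospam\n"
    "Your invoice is attached,spam\n"
)

correct, total = accuracy(csv.reader(sample), toy_classifier)
print("Efficiency: {0} of {1} ({2:.0%})".format(correct, total, correct / total))
```

Returning both the count and the total avoids hard-coding the size of the test set in the message, so the same function works if the test file grows or shrinks.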

Tip

We can use an out-of-the-box implementation of the Naive Bayes classifier, such as the NaiveBayesClassifier class in the NLTK package for Python. NLTK is a very powerful natural language toolkit, and we can download it from http://nltk.org/ ...
