  • Jaromir Salamon thinks this is interesting:

Different weight schemes can be used for calculating tfidf; the preceding example uses log with base 10 to calculate idf.


From R Machine Learning By Example


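For the case mentioned in the text, using the raw term count and a base-10 logarithm: tfidf = 3 * log_10(2/1) = 0.9031
Following the definitions at http://www.tfidf.com, which use a length-normalized term frequency and the natural logarithm: tfidf = 3/7 * log_e(2/1) = 0.2971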

- tf(t,d) = (Number of times term t appears in a document) / (Total number of terms in the document)
- idf(t,D) = log_e(Total number of documents / Number of documents with term t in it)
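The two results can be checked with a short Python sketch. It assumes the counts implied by the formulas above (the term appears 3 times in a 7-term document, and 1 of 2 documents contains it); the function name `tfidf` and its parameters are hypothetical, chosen only for this illustration.

```python
import math

def tfidf(tf, n_docs, n_docs_with_term, base=math.e):
    """Return tf * log_base(n_docs / n_docs_with_term).

    `tf` is passed in already computed, so the caller decides whether it is
    a raw count or a count normalized by document length.
    """
    return tf * math.log(n_docs / n_docs_with_term, base)

# Assumed counts, inferred from the worked numbers above.
term_count = 3        # occurrences of the term in the document
doc_length = 7        # total number of terms in the document
n_docs = 2            # documents in the collection
n_docs_with_term = 1  # documents containing the term

# Scheme 1: raw term count, base-10 logarithm.
raw_log10 = tfidf(term_count, n_docs, n_docs_with_term, base=10)

# Scheme 2: normalized term frequency, natural logarithm.
norm_ln = tfidf(term_count / doc_length, n_docs, n_docs_with_term)

print(round(raw_log10, 4))  # 0.9031
print(round(norm_ln, 4))    # 0.2971
```

The two schemes differ by more than a constant factor from the logarithm base: normalizing tf by document length also changes how documents of different sizes compare, so scores from the two schemes should not be mixed.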