Counting the most common words in tweets

In this example, we will develop a simple application that counts the number of occurrences of each word in positive tweets. First, we split each tweet into words. Then we remove all URLs (http://...) and Twitter usernames (@...). Next, we remove all words with three or fewer characters (such as the, why, she, him, and so on). Finally, we count the word frequencies. The following code shows the JavaScript map function that splits each tweet into words and applies these filters:

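function() {
    // Split the tweet text on spaces and examine each word.
    this.text.split(' ').forEach(
        function(word) {
            var txt = word.toLowerCase();
            // Skip user mentions, words of three or fewer characters, and URLs.
            if (!(/^@/).test(txt) &&
                txt.length > 3 &&
                !(/^http/).test(txt)) {
                emit(txt, 1);
            }
        }
    );
}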

The input will look like this:

'text': '@SomeUsr After using LaTeX a lot ...
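
To see how the map step combines with the final counting step, the following is a minimal sketch that runs outside any database. It assumes a map-reduce host (such as MongoDB's mapReduce) would normally supply emit and call the map function with each document as this. The names emit (as defined here), mapTweet, reduceCounts, and countWords, along with the two sample tweets, are hypothetical and exist only for illustration. The reduce function is an assumption about how the frequencies could be summed, not necessarily the one used in the original application.

```javascript
// Hypothetical stand-in for the emit() that a map-reduce host would supply.
const emitted = [];
function emit(key, value) {
    emitted.push([key, value]);
}

// Map step: the filtering rules described in the text.
function mapTweet() {
    this.text.split(' ').forEach(function (word) {
        var txt = word.toLowerCase();
        // Keep the word unless it is a mention, a short word, or a URL.
        if (!(/^@/).test(txt) && txt.length > 3 && !(/^http/).test(txt)) {
            emit(txt, 1);
        }
    });
}

// Reduce step (assumed): add up the 1s emitted for a single word.
function reduceCounts(key, values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
}

// Hypothetical driver: run map on every tweet, group by word, then reduce.
function countWords(tweets) {
    emitted.length = 0;
    tweets.forEach(function (tweet) { mapTweet.call(tweet); });

    // A Map avoids clashes with built-in object property names.
    const grouped = new Map();
    emitted.forEach(function (pair) {
        if (!grouped.has(pair[0])) {
            grouped.set(pair[0], []);
        }
        grouped.get(pair[0]).push(pair[1]);
    });

    const counts = new Map();
    grouped.forEach(function (values, word) {
        counts.set(word, reduceCounts(word, values));
    });
    return counts;
}

// Invented sample documents, shaped like the input shown above.
const sample = [
    { text: '@someone Writing papers with LaTeX http://example.com/x' },
    { text: 'LaTeX papers are fun' }
];
const counts = countWords(sample);
console.log(counts);
```

With these two invented tweets, the mention, the URL, and the short words are dropped, leaving counts for writing, papers, with, and latex. Splitting on a single space is kept from the original map function, so punctuation attached to a word stays part of that word.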
