Google+ is a great source of clean textual data that you can mine, but it’s just one of many starting points. Since this chapter showcases Google technology, this section provides a brief overview of how to tap into your Gmail data so that you can mine the text of what may be many thousands of messages in your inbox. If you haven’t read Chapter 3 yet, note that it’s devoted to mail analysis but focuses primarily on the structured data features of mail messages, such as the participants in mail threads. The techniques outlined in that chapter can easily be applied to Gmail messages, and vice versa.
In early 2010, Google announced OAuth access to IMAP and SMTP in Gmail. This was a significant announcement because it officially opened the door to “Gmail as a platform,” enabling third-party developers to build apps that can access your Gmail data without requiring you to give them your username and password. This section won’t get into the nuances of how Xoauth, Google’s particular implementation of OAuth, works (see “No, You Can’t Have My Password” for a terse introduction to OAuth). Instead, it focuses on getting you up and running so that you can access your Gmail data, which involves just a few simple steps:
Select the “Enable IMAP” option under the “Forwarding and POP/IMAP” tab in your Gmail Account Settings.
Visit the Google Mail Xoauth Tools wiki page, download the xoauth.py command-line utility, and follow the instructions ...
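Assuming the steps above leave you with an authentication string for your account, connecting to Gmail over IMAP might look roughly like the following sketch. It relies only on Python’s standard imaplib module. The helper names (make_auth_callback, connect_to_gmail, count_inbox_messages), the imap.gmail.com host on port 993, and the "XOAUTH" mechanism name are assumptions made for illustration rather than details taken from the steps above.

```python
import imaplib


def make_auth_callback(xoauth_string):
    """Build the callable that imaplib invokes during AUTHENTICATE.

    imaplib hands the server's challenge to the callable and base64-encodes
    the bytes it returns. This sketch ignores the challenge and always
    answers with the pre-generated string.
    """
    def callback(challenge):
        return xoauth_string.encode("utf-8")
    return callback


def connect_to_gmail(xoauth_string, host="imap.gmail.com", port=993):
    """Open an SSL IMAP connection and authenticate (hypothetical helper).

    The host, port, and "XOAUTH" mechanism name are assumptions.
    """
    conn = imaplib.IMAP4_SSL(host, port)
    conn.authenticate("XOAUTH", make_auth_callback(xoauth_string))
    return conn


def count_inbox_messages(conn):
    """Return the number of messages in INBOX without altering any flags."""
    conn.select("INBOX", readonly=True)
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("IMAP search failed: %r" % (status,))
    return len(data[0].split())
```

With a real authentication string in hand, you would call connect_to_gmail, pass the returned connection to count_inbox_messages, and finish with conn.logout(). Opening the mailbox read-only keeps an exploratory session from marking messages as seen.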