Document classification with and without GloVe

In this example, we'll use a well-known text classification problem, the 20 newsgroups problem (http://www.cs.cmu.edu/afs/cs.cmu.edu/project/theo-20/www/data/news20.html). We are given 19,997 documents, each belonging to one of 20 newsgroups. Our goal is to use the text of a post to predict which newsgroup it belongs to. For the millennials among us, a newsgroup is sort of the precursor to Reddit (but it's probably closer to the great-great-great grandfather of Reddit). The newsgroups cover a wide range of topics, including politics, religion, and operating systems, all of which you should avoid discussing in polite company. These posts ...
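To make the setup concrete, here is a minimal sketch of how the posts might be read into texts and integer labels. It assumes the archive has been extracted so that each newsgroup is a subdirectory containing one file per post. The function name `load_newsgroup_posts` and the `latin-1` decoding are illustrative choices, not something the text prescribes.

```python
import os


def load_newsgroup_posts(root_dir):
    """Read posts laid out as root_dir/<newsgroup>/<post_file>.

    Returns (texts, labels, label_names), where labels[i] is the index
    into label_names of the newsgroup that texts[i] came from.
    The directory layout is an assumption about the extracted archive.
    """
    texts, labels, label_names = [], [], []
    # Sort directory names so label ids are stable across runs.
    for name in sorted(os.listdir(root_dir)):
        group_dir = os.path.join(root_dir, name)
        if not os.path.isdir(group_dir):
            continue
        label_id = len(label_names)
        label_names.append(name)
        for fname in sorted(os.listdir(group_dir)):
            path = os.path.join(group_dir, fname)
            # latin-1 can decode any byte sequence, so odd bytes in a
            # post will not raise a decoding error.
            with open(path, encoding="latin-1") as f:
                texts.append(f.read())
            labels.append(label_id)
    return texts, labels, label_names
```

Calling `load_newsgroup_posts` on the extracted data directory would give one string per post plus a matching list of class ids, which is the form that a tokenizer and a classifier typically expect.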
