Closing Remarks

This chapter introduced the bare essentials of advanced unstructured data analytics and demonstrated how to use NLTK to go beyond the sentence parsing introduced in Chapter 7, putting together the rest of an NLP pipeline and extracting entities from text. The field of computational linguistics is still quite nascent, and solving NLP well for most of the world’s most commonly spoken languages is arguably the problem of the century. Push NLTK to its limits, and when you need more performance or quality, consider rolling up your sleeves and digging into some of the academic literature. It’s admittedly a daunting task, but a truly worthy problem if you are interested in tackling it.
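As a refresher, the sketch below strings together the off-the-shelf NLTK components that make up such a pipeline: sentence detection, tokenization, part-of-speech tagging, and named entity chunking. The sample text is purely illustrative, and the required corpora ('punkt', 'averaged_perceptron_tagger', 'maxent_ne_chunker', and 'words') are assumed to have been fetched with nltk.download().

```python
import nltk

# Hypothetical sample text standing in for any blob of unstructured prose
text = "Tim O'Reilly founded O'Reilly Media in Sebastopol, California."

for sentence in nltk.sent_tokenize(text):    # 1. Sentence detection
    tokens = nltk.word_tokenize(sentence)    # 2. Tokenization
    tagged = nltk.pos_tag(tokens)            # 3. Part-of-speech tagging
    tree = nltk.ne_chunk(tagged)             # 4. Named entity chunking

    # Collect the named entity chunks (subtrees) recognized in this sentence
    entities = [' '.join(word for word, tag in chunk.leaves())
                for chunk in tree
                if hasattr(chunk, 'label')]
    print(entities)
```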

If you’d like to expand on the contents of this chapter, consider using NLTK’s word-stemming tools to compute (entity, stemmed predicate, entity) tuples, building upon the code in Example 8-7 (a brief sketch of the idea follows below). You might also look into WordNet, a tool that you’ll undoubtedly run into sooner rather than later, to discover additional meaning about the items in the tuples. If you find yourself with copious free time on your hands, consider taking a look at some of the many popular commenting APIs, such as DISQUS, and try applying the NLP techniques we’ve covered to the comment streams for blog posts. Crafting a WordPress plug-in that intelligently suggests tags based upon the entities extracted from a draft blog post would also be a great way to spend some spare time.
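Here is a minimal, hedged sketch of that first exercise: normalize the predicate of a triple with NLTK’s Porter stemmer, then consult WordNet for additional senses of the stemmed form. The triple below is hypothetical and merely stands in for output produced along the lines of Example 8-7; the 'wordnet' corpus is assumed to be available via nltk.download('wordnet').

```python
from nltk.stem import PorterStemmer
from nltk.corpus import wordnet as wn

# Hypothetical (entity, predicate, entity) triple in the spirit of Example 8-7
triple = ('Mr. Green', 'murdered', 'Colonel Mustard')
subject, predicate, obj = triple

# Reduce the predicate to its stem (e.g. 'murdered' -> 'murder')
stemmer = PorterStemmer()
stemmed_predicate = stemmer.stem(predicate)
print((subject, stemmed_predicate, obj))

# Look up WordNet synsets for the stemmed predicate to learn more about it
for synset in wn.synsets(stemmed_predicate, pos=wn.VERB):
    print(synset.name(), '-', synset.definition())
```

From there, the synset definitions (and relations such as hypernyms) could be folded back into the tuples to enrich them with additional meaning, as suggested above.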
