Part Three A BAG OF CASE STUDIES

Overview of all case studies

Case study: Collaboration Networks in the U.S. Senate
Description: Scraping of bill cosponsorship data from the U.S. Senate at thomas.loc.gov, assessment of collaboration network structure
Scraping and information extraction via: URL manipulation, regular expressions
Main packages: RCurl, stringr, igraph
Important functions: getURL(), str_extract(), graph.edgelist(), get.adjacency()

Case study: Parsing Information from Semi-Structured Documents
Description: Scraping of climate data from Californian weather stations (ftp.wcc.nrcs.usda.gov), construction of a regex-based parser
Scraping and information extraction via: FTP download, regular expressions and string manipulation tools
Main packages: RCurl, stringr
Important functions: getURL(), str_extract(), str_replace()

Case study: Predicting the 2014 Academy Awards using Twitter
Description: Collection of tweets from Twitter API (dev.twitter.com/docs/api/streaming), frequency-based prediction of Oscar winners
Scraping and information extraction via: Persistent connection to Streaming API via streamR, regular expressions
Main packages: streamR, twitteR, lubridate, stringr, plyr
Important functions: filterStream(), parseTweets(), str_detect(), agrep()

Case study: Mapping the Geographic Distribution of Names
Description: Scraping phone book data from dastelefonbuch.de, extraction of zip codes and matching with geo-coordinates, creation of family name maps
Scraping and information extraction via: HTML forms, XPath and regular expressions, R geographic functionality
Main packages: RCurl, stringr, XML, maptools, maps, rgdal
Important functions: getForm(), htmlParse(), xpathSApply(), str_extract(), function()

Case study: Gathering Data on Mobile Phones
Description: Scraping of mobile ...

