Shaking the XML Tree
Parsing well-formed, valid XML is much easier than parsing the Sheriff’s HTML. An XML parsing package is available for R; here’s how to install it from CRAN:
> install.packages("XML")
> library("XML")
Warning
If you are behind a firewall or proxy and getting errors:
On Unix: Set your http_proxy environment variable.
On Windows: Try the custom R install wizard with the internet2 option instead of “standard”.
Our goal is to extract the values contained within the <Latitude> and <Longitude> leaf nodes. These nodes live within the <Result> node, which lives inside a <ResultSet> node, which itself lies inside the root node.
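To make the nesting concrete, here is a minimal sketch that walks a hand-written XML fragment with the same node names. The fragment and its coordinate values are made up for illustration and are not a real response from the geocoding service; the sketch also assumes <ResultSet> is the top-level element of the document.

```r
library(XML)

# Hypothetical sample document; the values are placeholders, not real API output.
sampleXml <- paste(
  "<ResultSet>",
  "<Result>",
  "<Latitude>39.95</Latitude>",
  "<Longitude>-75.16</Longitude>",
  "</Result>",
  "</ResultSet>",
  sep = "")

# asText=TRUE tells xmlTreeParse to parse the string itself rather than
# treating it as a file name or URL.
doc <- xmlTreeParse(sampleXml, asText = TRUE)

# xmlRoot returns the top-level element, here <ResultSet>.
resultSet <- xmlRoot(doc)

# Child nodes can be selected by name with [[ ]].
result <- resultSet[["Result"]]

# xmlValue returns the text inside a leaf node as a character string.
lat <- as.numeric(xmlValue(result[["Latitude"]]))
lon <- as.numeric(xmlValue(result[["Longitude"]]))
```

Each step descends one level of the tree, so the code mirrors the path described above from the root down to the leaf nodes.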
To find an appropriate function for getting these values, call library(help=XML), which lists the functions in the XML package.
> library(help=XML)
#hit space to scroll, q to exit
> ?xmlTreeParse
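If paging through the help index is tedious, one alternative is to search the package's exported names directly. This is only a sketch of one way to do it, and the variable names are illustrative:

```r
library(XML)

# List every object exported by the attached XML package.
exported <- ls("package:XML")

# Keep only the names containing "Parse" to narrow the search.
parseFunctions <- grep("Parse", exported, value = TRUE)
print(parseFunctions)
```

Any name in the result can then be looked up with ?, as with ?xmlTreeParse.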
I see that the function xmlTreeParse will accept an XML file or URL and return an R structure. Paste in this block after inserting your Yahoo App ID.
> library(XML)
> appid<-'<put your appid here>'
> street<-"1 South Broad Street"
> requestUrl<-paste(
    "http://local.yahooapis.com/MapsService/V1/geocode?appid=",
    appid,
    "&street=",
    URLencode(street),
    "&city=Philadelphia&state=PA",
    sep="")
> xmlResult<-xmlTreeParse(requestUrl,isURL=TRUE)
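The URL construction can be checked on its own, without contacting the service. The sketch below wraps the same paste() call in a helper; buildGeocodeUrl and the "YOUR_APP_ID" placeholder are hypothetical names introduced here, not part of the XML package or the original example.

```r
# Hypothetical helper: assembles the geocode request URL from its parts.
buildGeocodeUrl <- function(appid, street, city, state) {
  paste(
    "http://local.yahooapis.com/MapsService/V1/geocode?appid=", appid,
    # URLencode escapes characters such as spaces that are not valid in a URL.
    "&street=", utils::URLencode(street),
    "&city=", utils::URLencode(city),
    "&state=", utils::URLencode(state),
    sep = "")
}

# "YOUR_APP_ID" is a placeholder, not a working credential.
requestUrl <- buildGeocodeUrl("YOUR_APP_ID", "1 South Broad Street",
                              "Philadelphia", "PA")
print(requestUrl)
```

Printing the URL before passing it to xmlTreeParse makes it easy to spot an unescaped space or a missing App ID.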
Warning
Are you behind a firewall or proxy on Windows, and is this example giving you trouble? xmlTreeParse has no respect for your proxy settings. Do the following:
> Sys.setenv("http_proxy" ...