Reading XML data

You may sometimes need to extract data from websites, and many providers supply their data in XML or JSON format. This recipe shows how to read XML data into R.

Getting ready

If the XML package is not already installed in your R environment, install the package now as follows:

> install.packages("XML")
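
To make the installation conditional, as the sentence above suggests, one option is to check for the package first. This is a minimal sketch using base R's requireNamespace, which returns FALSE when the package cannot be loaded; it is one possible approach, not part of the original recipe.

```r
# Install the XML package only if it is not already available
if (!requireNamespace("XML", quietly = TRUE)) {
  install.packages("XML")
}
```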

How to do it...

XML data can be read by following these steps:

  1. Load the library and initialize:
    > library(XML)
    > url <- "http://www.w3schools.com/xml/cd_catalog.xml"
  2. Parse the XML file and get the root node:
    > xmldoc <- xmlParse(url)
    > rootNode <- xmlRoot(xmldoc)
    > rootNode[1]
  3. Extract XML data:
    > data <- xmlSApply(rootNode,function(x) xmlSApply(x, xmlValue))
  4. Convert the extracted data into a data frame:
    > cd.catalog <- data.frame(t(data),row.names=NULL) ...
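
The same steps can be tried without network access on a small inline document. The following is a minimal sketch, assuming a made-up two-record catalog: the XML string and its values are illustrative and are not taken from the file at the URL above. It passes asText = TRUE so that xmlParse treats its first argument as XML content rather than as a file name or URL.

```r
library(XML)

# Hypothetical sample data standing in for the remote catalog
xml_text <- paste0(
  "<CATALOG>",
  "<CD><TITLE>Album A</TITLE><ARTIST>Artist A</ARTIST><YEAR>1985</YEAR></CD>",
  "<CD><TITLE>Album B</TITLE><ARTIST>Artist B</ARTIST><YEAR>1988</YEAR></CD>",
  "</CATALOG>"
)

# Parse the string and get the root node
xmldoc <- xmlParse(xml_text, asText = TRUE)
rootNode <- xmlRoot(xmldoc)

# Outer call visits each CD node; inner call collects the text of its children.
# The result is a character matrix with one column per CD.
data <- xmlSApply(rootNode, function(x) xmlSApply(x, xmlValue))

# Transpose so that each CD becomes a row, then build the data frame
cd.catalog <- data.frame(t(data), row.names = NULL)

# Every value arrives as text, so numeric fields need explicit conversion
cd.catalog$YEAR <- as.integer(as.character(cd.catalog$YEAR))
```

Because xmlValue returns text, every column starts out as character (or factor, depending on the R version), which is why the sketch converts YEAR explicitly before any numeric use.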
