Data Mashups in R

Jeremy Leipzig

Xiao-Yi Li

June 5, 2009

Abstract

This article demonstrates how real-world data is imported, managed, visualized, and analyzed within the R statistical framework. Presented as a spatial mashup, this tutorial introduces the user to R packages, R syntax, and data structures. The user will learn how the R environment works with R packages as well as with its own statistical analysis capabilities. We will access spatial data in several formats (HTML, XML, shapefiles, and text), both locally and over the web, to produce a map of home foreclosure auctions and perform statistical analysis on these events.

Programmers can spend a good part of their careers scripting code to conform to commercial statistics packages, visualization tools, and domain-specific third-party software. The same tasks can force end users to spend countless hours in copy-paste purgatory, each minor change necessitating another grueling round of formatting tabs and screenshots. R scripting provides some reprieve. Because this open source project garners the support of a large community of package developers, the R statistical programming environment provides an amazing level of extensibility. Data from a multitude of sources can be imported into R and processed using R packages to aid statistical analysis and visualization. R scripts can also be configured to ...
