Python Geoprocessing with Hadoop

Most of the examples in this book have worked with relatively small datasets on a single computer. As data grows, however, datasets and even individual files may be spread across a cluster of machines, and working with data at that scale requires different tools. In this chapter, you will learn how to use Apache Hadoop to work with big data, and how to use the Esri GIS tools for Hadoop to query that data spatially.

This chapter will teach you how to:

  • Install Linux
  • Install and run Docker
  • Install and configure a Hadoop environment
  • Work with files in HDFS
  • Run basic queries using Hive
  • Install the Esri GIS tools for Hadoop
  • Perform spatial queries in Hive
