Kite Data

The Kite SDK (http://www.kitesdk.org) is a collection of classes, command-line tools, and examples that aims to make it easier to build applications on top of Hadoop.

In this section we will look at how Kite Data, a subproject of Kite, can ease integration with several components of a Hadoop data warehouse. Kite examples can be found at https://github.com/kite-sdk/kite-examples.

On Cloudera's QuickStart VM, Kite JARs can be found at /opt/cloudera/parcels/CDH/lib/kite/.

Kite Data is organized into a number of subprojects, some of which we'll describe in the following sections.

Data Core

As the name suggests, the core is the building block for all capabilities provided in the Data module. Its principal abstractions are datasets and repositories. ...
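To make the two abstractions concrete, the following is a minimal, self-contained sketch of the idea: a dataset as a named collection of records, and a repository as the place where datasets are created, looked up, and deleted by name. It is a toy model only. The classes ToyDataset, ToyRepository, and KiteConceptSketch and all of their methods are hypothetical names chosen for illustration; they are not Kite classes and do not reproduce Kite's API, which also deals with schemas, storage formats, and physical locations that are left out here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy classes for illustration only; these are NOT Kite classes.

/** A dataset: a named collection of records (plain strings in this sketch). */
final class ToyDataset {
    private final String name;
    private final List<String> records = new ArrayList<>();

    ToyDataset(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    void append(String record) {
        records.add(record);
    }

    int size() {
        return records.size();
    }

    /** Read-only view of the records, in insertion order. */
    List<String> records() {
        return Collections.unmodifiableList(records);
    }
}

/** A repository: manages datasets by name (create, load, exists, delete, list). */
final class ToyRepository {
    private final Map<String, ToyDataset> datasets = new LinkedHashMap<>();

    ToyDataset create(String name) {
        if (datasets.containsKey(name)) {
            throw new IllegalStateException("Dataset already exists: " + name);
        }
        ToyDataset dataset = new ToyDataset(name);
        datasets.put(name, dataset);
        return dataset;
    }

    ToyDataset load(String name) {
        ToyDataset dataset = datasets.get(name);
        if (dataset == null) {
            throw new IllegalArgumentException("No such dataset: " + name);
        }
        return dataset;
    }

    boolean exists(String name) {
        return datasets.containsKey(name);
    }

    /** Returns true if a dataset with this name was present and has been removed. */
    boolean delete(String name) {
        return datasets.remove(name) != null;
    }

    /** Names of all datasets, in creation order. */
    List<String> list() {
        return new ArrayList<>(datasets.keySet());
    }
}

class KiteConceptSketch {
    public static void main(String[] args) {
        ToyRepository repo = new ToyRepository();
        ToyDataset users = repo.create("users");
        users.append("alice");
        users.append("bob");

        System.out.println(repo.list());               // prints [users]
        System.out.println(repo.load("users").size()); // prints 2

        repo.delete("users");
        System.out.println(repo.exists("users"));      // prints false
    }
}
```

The point of the split is that application code talks to the repository to find or create a dataset by name and then reads or writes records through the dataset, without handling the underlying storage itself.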
