Hadoop-specific data types

Up to this point we've glossed over the actual data types used as the input and output of the map and reduce classes. Let's take a look at them now.

The Writable and WritableComparable interfaces

If you browse the Hadoop API for the org.apache.hadoop.io package, you'll see some familiar classes such as Text and IntWritable, along with other classes whose names end in Writable.

This package also contains the Writable interface, which is specified as follows:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

public interface Writable
{
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

The main purpose of this interface is to provide mechanisms for the serialization ...
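
To make the serialization role concrete, here is a minimal sketch of a custom type that implements the two methods and is round-tripped through a byte array. It rests on several assumptions: the Writable interface is redeclared locally (mirroring the definition quoted above) so that the example compiles without Hadoop on the classpath, and the PageHits class, its fields, and the WritableRoundTrip helper are hypothetical names invented for illustration. In a real job you would import org.apache.hadoop.io.Writable instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Assumption: local stand-in mirroring the interface quoted above, so the
// sketch compiles without Hadoop. Real code would import
// org.apache.hadoop.io.Writable instead of declaring this.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical value type: a page URL and the number of hits it received.
class PageHits implements Writable {
    String url = "";
    int hits;

    // No-arg constructor so an empty instance can be created first
    // and then populated by readFields.
    PageHits() {}

    PageHits(String url, int hits) {
        this.url = url;
        this.hits = hits;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // Fields are written in a fixed order...
        out.writeUTF(url);
        out.writeInt(hits);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // ...and must be read back in exactly the same order.
        url = in.readUTF();
        hits = in.readInt();
    }
}

// Hypothetical helper that serializes to, and deserializes from, a byte array.
class WritableRoundTrip {
    static byte[] serialize(Writable w) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            w.write(out);
        }
        return bytes.toByteArray();
    }

    static void deserialize(Writable w, byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            w.readFields(in);
        }
    }

    public static void main(String[] args) throws IOException {
        PageHits original = new PageHits("/index.html", 42);
        PageHits copy = new PageHits();
        deserialize(copy, serialize(original));
        System.out.println(copy.url + " " + copy.hits); // prints "/index.html 42"
    }
}
```

The point to notice is the symmetry: readFields must consume the fields in the same order and with the same types that write produced them, because the byte stream carries no field names.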
