The RecordReader class
Unlike InputSplit, the RecordReader class presents a record view of the data to the Map task. RecordReader works within each InputSplit and generates records from the data in the form of key-value pairs. The InputSplit boundary is a guideline for RecordReader and is not enforced. At one extreme, a custom RecordReader class can be written to read an entire file (though this is not encouraged). Most often, a RecordReader will have to read from a subsequent InputSplit to present the complete record to the Map task. This happens when a record straddles an InputSplit boundary. The reading of bytes from a subsequent InputSplit happens via FSDataInputStream objects. Though this reading does not respect locality ...
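The boundary handling described above can be sketched in plain Java, without the Hadoop API. This is a simplified simulation, assuming newline-delimited records and loosely following the convention used by line-oriented readers: a reader whose split does not start at the beginning of the file discards its first (possibly partial) line, and every reader is allowed to run past the end of its split to finish the last record it started. The class name SplitLineReader and the method readSplit are hypothetical, and a byte array stands in for a file that a real RecordReader would read through FSDataInputStream.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop.
final class SplitLineReader {

    // Returns the newline-delimited records owned by the split
    // covering bytes [start, start + length) of data.
    static List<String> readSplit(byte[] data, int start, int length) {
        int end = Math.min(start + length, data.length);
        int pos = start;

        // A split that does not begin the file skips its first line.
        // That line is produced by the reader of the previous split,
        // which reads past its own boundary to complete it.
        if (start != 0) {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }

        List<String> records = new ArrayList<>();
        // A record belongs to this split if it starts at or before 'end'.
        // The boundary is only a guideline: the inner loop may read
        // bytes that lie in the next split.
        while (pos <= end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return records;
    }
}
```

Calling readSplit once per split over the same byte array, with consecutive start offsets, yields every record exactly once even when a record crosses a split boundary. In a real job the bytes past the boundary may live in a block stored on another node, which is why such reads do not respect locality.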