Time for action – writing network traffic onto HDFS

This discussion of Flume in a book about Hadoop hasn't actually used Hadoop at all so far. Let's remedy that by writing data onto HDFS via Flume.

  1. Create the following file as agent4.conf within the Flume working directory:
    agent4.sources = netsource
    agent4.sinks = hdfssink
    agent4.channels = memorychannel

    agent4.sources.netsource.type = netcat
    agent4.sources.netsource.bind = localhost
    agent4.sources.netsource.port = 3000

    agent4.sinks.hdfssink.type = hdfs
    agent4.sinks.hdfssink.hdfs.path = /flume
    agent4.sinks.hdfssink.hdfs.filePrefix = log
    agent4.sinks.hdfssink.hdfs.rollInterval = 0
    agent4.sinks.hdfssink.hdfs.rollCount = 3
    agent4.sinks.hdfssink.hdfs.fileType = DataStream

    agent4.channels.memorychannel.type ...
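
If you want to try the agent out at this point, the commands below are a rough sketch of how you might start it and confirm that events arrive on HDFS. The --conf directory, the use of nc to feed the netcat source, and the exact output filenames under /flume are assumptions based on the configuration above rather than steps from this section; any client that writes lines to port 3000 (telnet, curl telnet://, and so on) would serve equally well.

    # Start the agent defined in agent4.conf (run from the Flume working directory)
    flume-ng agent --conf conf --conf-file agent4.conf --name agent4

    # In another terminal, send a few lines of "network traffic" to the netcat source;
    # with rollCount = 3 the HDFS sink closes a file after every three events
    printf 'event one\nevent two\nevent three\n' | nc localhost 3000

    # List and inspect the resulting file(s) on HDFS
    hadoop fs -ls /flume
    hadoop fs -cat "/flume/log*"

Because rollInterval is 0 and rollCount is 3, files are rolled purely by event count, so you should see a new log-prefixed file in /flume for every three lines you send.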
