Chapter 5. Sources and Channel Selectors

Now that we have covered channels and sinks, we will look at some of the more common ways to get data into your Flume agents. As discussed in Chapter 1, Overview and Architecture, the source is the input point for the Flume agent. Many sources ship with the Flume distribution, and many more open source options are available. As with most open source software, if you can't find what you need, you can always write your own by extending the org.apache.flume.source.AbstractSource class. Since the primary focus of this book is ingesting log files into Hadoop, we'll cover a few of the sources best suited to that task.
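As a rough illustration of where a source sits in an agent, the following properties fragment sketches how a source might be declared and connected to a channel. The agent name (agent), the component names (s1, c1), and the class com.example.flume.MySource are hypothetical placeholders chosen for this sketch; a custom source is referenced by its fully qualified class name, while bundled sources use a short type alias.

```properties
# Hypothetical agent named "agent" with one source and one channel
agent.sources = s1
agent.channels = c1

# A custom source is referenced by its fully qualified class name.
# com.example.flume.MySource is an assumed class that extends
# org.apache.flume.source.AbstractSource.
agent.sources.s1.type = com.example.flume.MySource

# Every source must be connected to at least one channel
agent.sources.s1.channels = c1

# A memory channel is used here only to keep the sketch short
agent.channels.c1.type = memory
```

Note that a source uses the plural channels property because it can write to more than one channel, which is where the channel selectors in this chapter's title come in.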

The problem with using tail

If you have used any of the Flume 0.9 releases, ...
