Chapter 2. Installing and Running Drill

Drill is a layered Java application. At its core is a set of Java JAR files that you can include in another Java application; Drill calls this embedded mode. The sqlline tool we’ll discuss is one example; the JDBC driver is another. Embedded mode is handy for learning Drill and for working with datasets stored on your local machine.

More typically, however, you run Drill as a server; Drill calls this server mode. The Drill server, called a Drillbit, is just a wrapper around the core Drill library, so you get much of the same functionality either way. You can run a single Drillbit, or you can run multiple Drillbits on a cluster. Drill calls this distributed mode. Its key advantage is much easier access to distributed filesystems such as HDFS or S3.

You can run a single Drillbit on your laptop, which lets you try out the server features that you’ll use in production, including working with distributed filesystems. You can do a lot with Drill without having a Hadoop cluster at your disposal. When Drill is installed on a cluster, you work with it in exactly the same way as on your laptop, but Drill now distributes its load across multiple machines. When you run Drill as a server, you also need ZooKeeper running.
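
From a client's point of view, the difference between the modes shows up mainly in the JDBC connection URL. The following is a minimal sketch, not taken from this chapter, of how those URLs are typically formed with the Drill JDBC driver: `zk=local` for embedded mode, `drillbit=host:port` for a direct connection to one Drillbit, and `zk=hosts/root/clusterId` when ZooKeeper is used to locate Drillbits. The class name `DrillUrls`, the host names, and the values `drill` and `drillbits1` are illustrative assumptions; check them against your own installation. The sketch only builds strings, so it runs without Drill on the classpath.

```java
import java.util.List;

// Hypothetical helper (not part of Drill) that builds JDBC URLs for each mode.
public class DrillUrls {

    // Embedded mode: the driver starts Drill inside the client JVM.
    public static String embedded() {
        return "jdbc:drill:zk=local";
    }

    // Server mode, direct connection to a single Drillbit.
    public static String direct(String host, int port) {
        return "jdbc:drill:drillbit=" + host + ":" + port;
    }

    // Server mode via ZooKeeper: the client asks ZooKeeper which Drillbits exist.
    // zkRoot and clusterId must match the values configured for the cluster.
    public static String viaZooKeeper(List<String> zkHosts, String zkRoot, String clusterId) {
        return "jdbc:drill:zk=" + String.join(",", zkHosts) + "/" + zkRoot + "/" + clusterId;
    }

    public static void main(String[] args) {
        System.out.println(embedded());
        // 31010 is assumed here as the Drillbit user port.
        System.out.println(direct("localhost", 31010));
        // Host names below are placeholders for a ZooKeeper quorum.
        System.out.println(viaZooKeeper(List.of("zk1:2181", "zk2:2181"), "drill", "drillbits1"));
    }
}
```

Passing any of these strings to `java.sql.DriverManager.getConnection` requires the Drill JDBC driver on the classpath and, for the last two forms, a running Drillbit.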

This chapter explains how to install Drill on your laptop, run it in both embedded and server modes, and configure your system to work with Drill. Chapter 9 ...
