Agile Data Science by Russell Jurney

Chapter 3. Agile Tools

Introduction

In this chapter we will briefly introduce our software stack. This stack is optimized for our process. By the end of this chapter, we’ll have you collecting, storing, processing, publishing, and decorating data. Our stack enables one person to do all of this, to go ‘full stack.’ We’ll cover a lot, and we’ll go quickly, but don’t worry: we will continue to demonstrate this software stack in Chapters 6 through 11. You need only understand the basics now; you will get more comfortable later.

We begin with instructions for running our stack in local mode on your own machine. In the next chapter, we go on to show you how to scale this same stack in the cloud via Amazon Web Services. Let’s get started.

Example Code

Code examples for this chapter are available at https://github.com/rjurney/Agile_Data_Code/tree/master/ch03. Clone the repository and follow along!

git clone https://github.com/rjurney/Agile_Data_Code.git

Scalability = Simplicity

As NoSQL tools like Hadoop and MongoDB, the emerging field of data science, and ‘Big Data’ have developed, much focus has been placed on the plumbing of analytics applications. In this book, we are teaching you to build applications that use such infrastructure. We will take this plumbing for granted and build applications that depend on it. As such, we are devoting only two chapters to infrastructure: one introducing our development tools, and one on scaling them up in the cloud to match our data’s scale.

In choosing our tools ...
