Appendix A. List of Command-Line Tools

This appendix is an overview of all the command-line tools discussed in this book, including binary executables, interpreted scripts, and Bash builtins and keywords. For each command-line tool, the following information is provided when available and appropriate:

  • The actual command to type at the command line

  • A description

  • The name of the package it belongs to

  • The version used in the book

  • The year that version was released

  • The primary author(s)

  • A website to find more information

  • How to install it

  • How to obtain help

  • An example usage

All command-line tools listed here are included in the Data Science Toolbox for Data Science at the Command Line. See Chapter 2 for instructions on how to set it up. The install commands assume that you’re running Ubuntu 14.04. Please note that citing open source software is not trivial, and that some information may be missing or incorrect.

alias

Define or display aliases. Alias is a Bash builtin.

$ help alias
$ alias ll='ls -alF'
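The example above only defines an alias. As a minimal sketch of the "display" half of the description, naming an existing alias without a value prints its definition. The exact output format depends on the shell, so none is shown here.

```shell
# Define the alias, then display its definition by naming it without a value.
alias ll='ls -alF'
alias ll
```

Running alias with no arguments at all lists every alias currently defined.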

awk

Pattern scanning and text processing language. Mawk (version 1.3.3) by Mike Brennan (1994). http://invisible-island.net/mawk.

$ sudo apt-get install mawk
$ man awk
$ seq 5 | awk '{sum+=$1} END {print sum}'
15
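The one-liner above uses an action with no pattern, so it runs on every line. As a further sketch (not taken from the book), a pattern can be placed before the action to select which lines it applies to. This assumes a POSIX-compatible awk such as mawk.

```shell
# Sum only the even numbers from 1 to 10.
# The pattern ($1 % 2 == 0) selects lines; the action ({sum += $1}) runs on them.
seq 10 | awk '$1 % 2 == 0 {sum += $1} END {print sum}'
# prints 30
```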

aws

Manage AWS Services such as EC2 and S3 from the command line. AWS Command Line Interface (version 1.3.24) by Amazon Web Services (2014). http://aws.amazon.com/cli.

$ sudo pip install awscli
$ aws help
$ aws ec2 describe-regions | head -n 5
{
    "Regions": ...
