Checking a tar Archive for Unique Directories

Problem

You want to untar an archive, but you want to know beforehand which directories it will write into. You can look at the table of contents of the tarfile by using tar -t, but this output can be very large, and it’s easy to miss something.

Solution

Use an awk script to pull the leading directory names out of the tar archive’s table of contents, then use sort -u to leave you with just the unique directory names:

$ tar tf some.tar | awk -F/ '{print $1}' | sort -u

Discussion

The t option produces the table of contents of the archive, and the f option names the archive file; the filename follows it. The awk command uses -F/ to make a slash the field separator instead of the default whitespace. Thus, print $1 prints the first component of each pathname, which is normally the top-level directory name.

Finally, sort -u sorts the directory names and prints each unique name only once.
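To see what the awk and sort stages do without needing an archive, here is a minimal sketch that feeds them a simulated table of contents. The pathnames in listing are made up for illustration, and LC_ALL=C is added only so the sort order is the same on every system:

```shell
# Simulated output of `tar tf` (hypothetical paths, not from a real archive).
listing='project/README
project/src/main.c
project/src/util.c
docs/guide.txt
./notes.txt'

# Same awk and sort stages as in the recipe's pipeline.
top_level=$(printf '%s\n' "$listing" | awk -F/ '{print $1}' | LC_ALL=C sort -u)
printf '%s\n' "$top_level"
```

The five pathnames collapse to three lines: a single period (from ./notes.txt), docs, and project.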

If a line of the output consists of a single period, then some files will be extracted into the current directory when you unpack this tar file, so be sure you are in the directory you want.

Similarly, if the filenames in the archive are all local and without a leading ./ then you will get a list of filenames that will be created in the current directory.
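Because the pipeline prints the first field of every line, top-level files and top-level directories look the same in its output. One possible way to tell them apart, sketched here as an assumption rather than as part of the recipe, is to test awk's NF (number of fields): a pathname containing a slash has more than one field. The names in listing are hypothetical:

```shell
# Simulated `tar tf` output (hypothetical names): bare files and one directory.
listing='README
src/
src/main.c
Makefile'

# Lines containing a slash have more than one field, so $1 is a directory.
dirs=$(printf '%s\n' "$listing" | awk -F/ 'NF > 1 {print $1}' | LC_ALL=C sort -u)
# Lines without a slash are files created directly in the current directory.
files=$(printf '%s\n' "$listing" | awk -F/ 'NF == 1 {print $1}' | LC_ALL=C sort -u)

printf 'directories:\n%s\n' "$dirs"
printf 'files:\n%s\n' "$files"
```

Here dirs holds src, while files holds Makefile and README.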

If the output contains a blank line, that means that some of the files are specified with absolute pathnames (i.e., beginning with /), so again be careful, as extracting such an archive might clobber something that you don’t want replaced. ...
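The special cases described above (a single period, a blank line) are easy to overlook in a long list, so it can help to label them. The following is only a sketch: classify_entries is a hypothetical helper written for this illustration, not a tar feature, and the input is simulated with printf:

```shell
# classify_entries is a hypothetical helper, not part of tar or the recipe.
# It reads the pipeline's output on stdin and labels each line.
classify_entries() {
    while IFS= read -r entry; do
        case "$entry" in
            '') echo 'WARNING: absolute pathnames (leading /)' ;;
            .)  echo 'NOTE: ./ entries extract into the current directory' ;;
            *)  echo "top-level name: $entry" ;;
        esac
    done
}

# Simulated pipeline output: a blank line, a single period, and one name.
report=$(printf '\n.\ndocs\n' | classify_entries)
printf '%s\n' "$report"
```

With a real archive, the function would go at the end of the recipe's pipeline: tar tf some.tar | awk -F/ '{print $1}' | sort -u | classify_entries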
