Finding and deleting duplicate files or directories

Earlier, we looked at checking whether the strings inside a file were unique and at sorting them, but we haven't yet performed a similar operation on files themselves. Before diving in, let's define what counts as a duplicate file for the purpose of this recipe: a duplicate file is one that may have a different name but has the same contents as another.

One way to compare the contents of files would be to remove all white space and check only the strings contained within. A simpler approach is to use tools such as sha512sum and md5sum to generate a hash of each file's contents (think of a fixed-length string of gibberish): files with identical contents always produce the same hash, and different contents are extremely unlikely to produce the same hash by accident. The general flow ...
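As a rough illustration of the hashing idea, the following is a minimal sketch and not necessarily the flow this recipe goes on to use. It assumes GNU coreutils (sha512sum, and uniq with the -w and --all-repeated options) and file names that contain no newlines or backslashes. The function names find_duplicates and list_extra_copies are hypothetical, chosen only for this example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print groups of files under a directory whose
# contents are identical, one blank-line-separated group per hash.
find_duplicates() {
    local dir="${1:-.}"
    # Hash every regular file, sort so equal hashes sit next to each other,
    # then keep only lines whose first 128 characters (the SHA-512 hex
    # digest) occur more than once.
    find "$dir" -type f -exec sha512sum {} + \
        | sort \
        | uniq -w 128 --all-repeated=separate
}

# Hypothetical helper: print every copy except the first one in each group.
# This only lists candidates for deletion; it removes nothing.
list_extra_copies() {
    local dir="${1:-.}"
    # The file name starts at column 131: 128 digest characters plus the
    # two-character separator that sha512sum writes.
    find "$dir" -type f -exec sha512sum {} + \
        | sort \
        | awk 'seen[$1]++ { print substr($0, 131) }'
}
```

Running find_duplicates on a directory shows each set of identical files side by side, while list_extra_copies prints only the surplus copies. Reviewing that list by hand before passing any of it to rm is the safer choice, since which copy counts as "first" here is simply the one that sorts first by path.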
