Get full access to Bash Cookbook and 60K+ other titles, with a free 10-day trial of O'Reilly.

There are also live events, courses curated by job role, and more.

Finding and deleting duplicate files or directories

At one point, we had already talked about checking to see if strings inside of a file were unique and if we could sort them, but we haven't yet performed a similar operation on files. However, before diving in, let's make some assumptions about what constitutes a duplicate file for the purpose of this recipe: a duplicate file is one that may have a different name, but the same contents as another.

One way to investigate the contents of a file would be to remove all white space and purely check the strings contained within, or we could merely use tools such as SHA512sum and MD5sum to generate a unique hash (think unique string full of gibberish) of the contents of the files. The general flow ...

Get Bash Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.

Start your free trial

Don’t leave empty-handed

Get Mark Richards’s Software Architecture Patterns ebook to better understand how to design components—and how they should interact.

It’s yours, free.

Get it now

Check it out now on O’Reilly

Dive in for free with a 10-day trial of the O’Reilly learning platform—then explore all the other resources our members count on to build skills and solve problems every day.

Start your free trial Become a member now