Handling Filenames Containing Odd Characters

Problem

You used a find command like the one in Finding All Your MP3 Files but the results were not what you intended because many of your filenames contain odd characters.

Solution

First, understand that to Unix folks, odd means “anything not a lowercase letter, or maybe a number.” So uppercase, spaces, punctuation, and character accents are all odd. But you’ll find all of those and more in the names of many songs and bands.

Depending on the oddness of the characters, your system, tools, and goal, it might be enough to simply quote the replacement string (i.e., put single quotes around the {}, as in '{}'). You did test your command first, right?

If that’s no good, try using the -print0 argument to find and the -0 argument to xargs. -print0 tells find to terminate each pathname it prints with the null character (\0) instead of a newline. -0 then tells xargs to expect that same null delimiter on its input rather than splitting on whitespace. Because a null can never appear inside a pathname, this pair handles any filename, but the options are not supported on every system.
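As a minimal sketch of that pairing, the script below creates a scratch directory with a few awkwardly named files and moves them using null-delimited pathnames. The directory layout and filenames are hypothetical, chosen only for illustration, and the script assumes your find and xargs support -print0, -0, and -I.

```shell
#!/usr/bin/env bash
# Hypothetical scratch area; cleaned up automatically when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
src="$work/music"
dest="$work/songs"
mkdir -p "$src" "$dest"

# Sample filenames with an uppercase letter, a space, and an apostrophe.
touch "$src/Plain.mp3" "$src/With Space.mp3" "$src/It's Odd.mp3"

# find ends each pathname with \0; xargs -0 splits on \0 only, so spaces
# and quotes inside a name reach mv untouched. -I '{}' runs mv once per
# pathname, substituting the pathname for {}.
find "$src" -name '*.mp3' -print0 | xargs -0 -I '{}' mv '{}' "$dest"

# List what arrived in the destination directory.
ls -1 "$dest"
```

Without -print0 and -0, xargs would split "With Space.mp3" into two arguments and would typically complain about the unmatched quote in "It's Odd.mp3".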

The xargs command takes whitespace-delimited (except when using -0) pathnames from standard input and executes a specified command on as many of them as possible (up to a bit less than the system’s ARG_MAX value; see Working Around “argument list too long” Errors). Since there is a lot of overhead associated with calling other commands, using xargs can drastically speed up operations because you are calling the other command as few times as possible, rather than each time a pathname is ...
