7.23. Extract the Filename from a Windows Path

Problem

You have a string that holds a (syntactically) valid path to a file or folder on a Windows PC or network, and you want to extract the filename, if any, from the path. For example, you want to extract file.ext from c:\folder\file.ext.

Solution

[^\\/:*?"<>|\r\n]+$
Regex options: Case insensitive
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Discussion

Extracting the filename from a string known to hold a valid path is trivial, even if you don’t know whether the path actually ends with a filename.

The filename always occurs at the end of the string. It can’t contain any colons or backslashes, so it cannot be confused with folders, drive letters, or network shares, which all use backslashes and/or colons.

The anchor $ matches at the end of the string (Recipe 2.5). The fact that the dollar also matches at embedded line breaks in Ruby doesn’t matter, because valid Windows paths don’t include line breaks. The negated character class [^\\/:*?"<>|\r\n]+ (Recipe 2.3) matches the characters that can occur in filenames. Though the regex engine scans the string from left to right, the anchor at the end of the regex makes sure that only the last run of filename characters in the string will be matched, giving us our filename.
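As a rough illustration of the matching behavior described above, here is a minimal Python sketch that applies the recipe's regex unchanged. The helper name extract_filename and the sample paths (other than c:\folder\file.ext) are assumptions made for this example, not part of the recipe. The case-insensitive flag is included only to mirror the listed regex options; it has no effect on this particular pattern.

```python
import re

# The recipe's regex, unchanged. The raw string passes the backslashes through
# to the re module, which interprets \\, \r, and \n inside the character class.
FILENAME_RE = re.compile(r'[^\\/:*?"<>|\r\n]+$', re.IGNORECASE)


def extract_filename(path):
    """Return the filename at the end of path, or None if there is none.

    extract_filename is a hypothetical helper name used for illustration.
    """
    match = FILENAME_RE.search(path)
    # The overall match is the filename itself, so group() with no
    # arguments is all that is needed.
    return match.group() if match else None


print(extract_filename(r"c:\folder\file.ext"))       # file.ext
print(extract_filename(r"\\server\share\file.ext"))  # file.ext
print(extract_filename("c:\\folder\\"))              # None
```

The search call, rather than a whole-string match, lets the engine skip over the drive and folder parts, and the $ anchor restricts the result to the final run of filename characters. A path ending in a backslash produces no match, which the helper reports as None.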

If the string ends with a backslash, as it will for paths that don’t specify a filename, the regex won’t match at all. When it does match, it will match only the filename, so we don’t need to use any capturing ...
