You want to change all field-delimiting commas in a CSV file to tabs. Commas that occur within double-quoted values should be left alone.
The following regular expression matches an individual CSV field along with its preceding delimiter, if any. The preceding delimiter is usually a comma, but can also be an empty string (i.e., nothing) when matching the first field of the first record, or a line break when matching the first field of any subsequent record. Every time a match is found, the field itself, including the double quotes that may surround it, is captured to backreference 2, and its preceding delimiter is captured to backreference 1.
The regular expressions in this recipe are designed to work correctly with valid CSV files only, according to the format rules discussed in Comma-Separated Values (CSV).
(,|\r?\n|^)([^",\r\n]+|"(?:[^"]|"")*")?
|Regex options: None|
Here is the same regular expression again in free-spacing mode:
( , | \r?\n | ^ )   # Capture the leading field delimiter to backref 1
(                   # Capture a single field to backref 2:
  [^",\r\n]+        #   Unquoted field
|                   #  Or:
  " (?:[^"]|"")* "  #   Quoted field (may contain escaped double quotes)
)?                  # The group is optional because fields may be empty
|Regex options: Free-spacing|
|Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby|
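As a rough illustration of how the two backreferences can drive the replacement, here is a minimal Python sketch. It is not the code from Recipe 3.11; the names `CSV_FIELD` and `commas_to_tabs` are hypothetical, and it assumes the whole file has already been read into a string and is valid CSV as described above.

```python
import re

# The free-spacing regex from this recipe, compiled with re.VERBOSE.
# CSV_FIELD is a hypothetical name chosen for this sketch.
CSV_FIELD = re.compile(r'''
    ( , | \r?\n | ^ )      # Leading field delimiter -> group 1
    (                      # A single field -> group 2:
        [^",\r\n]+         #   Unquoted field
      |                    #  Or:
        " (?:[^"]|"")* "   #   Quoted field (may contain escaped quotes)
    )?                     # Optional because fields may be empty
''', re.VERBOSE)


def commas_to_tabs(csv_text):
    """Replace field-delimiting commas with tabs, leaving quoted commas alone."""
    def swap(match):
        delimiter = match.group(1)
        # Only a comma delimiter is changed; line breaks and the empty
        # start-of-string match are put back untouched.
        if delimiter == ",":
            delimiter = "\t"
        # Group 2 is None when the field is empty.
        return delimiter + (match.group(2) or "")

    return CSV_FIELD.sub(swap, csv_text)


print(repr(commas_to_tabs('a,"b,c",d\n1,,3')))
# prints 'a\t"b,c"\td\n1\t\t3'
```

Because each match consumes a delimiter together with the field that follows it, a comma inside a quoted field is always swallowed as part of group 2 and never reaches the delimiter check.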
Using this regex and the code in Recipe 3.11, you can iterate over your CSV file and ...