Converting a Data File to CSV
Problem
You have a data file that you need to convert to a Comma Separated Values (CSV) file.
Solution
Use awk to convert the data into CSV format:
$ awk 'BEGIN { FS="\t"; OFS="\",\"" } { gsub(/"/, "\"\""); $1 = $1; printf "\"%s\"\n", $0}' tab_delimited
"Line 1","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 2","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 3","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 4","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
You can do the same thing in Perl:
$ perl -naF'\t' -e 'chomp @F; s/"/""/g for @F; print q(").join(q(","), @F).qq("\n);' tab_delimited
"Line 1","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 2","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 3","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
"Line 4","Field 2","Field 3","Field 4","Field 5 with ""internal"" double-quotes"
Discussion
First of all, it’s tricky to define exactly what CSV really means. There is no binding formal specification (RFC 4180 describes a common format, but it is only informational), and vendors have implemented differing variants. Our version here is very simple and should work with most tools that read CSV. We place double quotes around all fields (some implementations quote only strings, or only strings with internal commas), and we double any internal double quotes.
To do that, we have awk split up the input ...
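To see the quoting rules from the Solution at work on a single record, here is a minimal sketch that wraps the same awk program in a shell function. The function name to_csv and the sample record are assumptions made for illustration; they are not part of the original recipe.

```shell
#!/usr/bin/env bash

# to_csv: hypothetical helper that wraps the awk command from the Solution.
# Reads tab-delimited text from the named files (or stdin) and writes CSV.
to_csv() {
    awk 'BEGIN { FS = "\t"; OFS = "\",\"" }
         {
             gsub(/"/, "\"\"")      # double every internal double quote
             $1 = $1                # assignment forces $0 to be rebuilt with OFS
             printf "\"%s\"\n", $0  # add the outermost opening and closing quotes
         }' "$@"
}

# Illustrative record: three tab-separated fields, the last one
# containing both double quotes and a comma.
printf 'Line 1\tField 2\tsays "hi", twice\n' | to_csv
```

Run as written, this should print "Line 1","Field 2","says ""hi"", twice". The comma inside the last field is harmless because every field is quoted, and the internal double quotes are doubled rather than backslash-escaped.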