Extracting and Rearranging Datafile Columns

Problem

You want to pull out columns from a datafile or rearrange them into a different order.

Solution

Use a utility that can produce columns from a file on demand.

Discussion

cvt_file.pl serves as a tool that converts entire files from one format to another. Another common datafile operation is to manipulate its columns. This is necessary, for example, when importing a file into a program that doesn’t understand how to extract or rearrange input columns for itself. Perhaps you want to omit columns from the middle of a file so you can use it with LOAD DATA, which cannot skip over columns in the middle of data lines. Or perhaps you have a version of mysqlimport older than 3.23.17, which doesn’t support the --columns option that allows you to indicate the order in which table columns appear in the file. To work around these problems, you can rearrange the datafile instead.

Recall that this chapter began with a description of a scenario involving a 12-column CSV file somedata.csv from which only columns 2, 11, 5, and 9 were needed. You can convert the file to tab-delimited format like this:

% cvt_file.pl --iformat=csv somedata.csv > somedata.txt

But then what? If you just want to knock out a short script to extract those specific four columns, that’s fairly easy: write a loop that reads input lines and writes only the columns you want in the proper order. Assuming input in tab-delimited, linefeed-terminated format, a simple Perl program to pull ...

Get MySQL Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.