Ignoring Datafile Columns

Problem

Your datafile contains columns that should be ignored rather than loaded into the table.

Solution

That’s not a problem if the columns are at the ends of the input lines. Otherwise, you can use a column list with LOAD DATA that assigns the columns to be ignored to a dummy user-defined variable.

Discussion

Extra columns that occur at the end of input lines are easy to handle. If a line contains more columns than are in the table, LOAD DATA just ignores them (although it might produce a nonzero warning count).

Skipping columns in the middle of lines is a bit more involved. Suppose that you want to load information from a Unix password file /etc/passwd, which contains lines in the following format:

account:password:UID:GID:GECOS:directory:shell

Suppose also that you don’t want to bother loading the password column. A table to hold the information in the other columns looks like this:

CREATE TABLE passwd
(
  account   CHAR(8),  # login name
  uid       INT,      # user ID
  gid       INT,      # group ID
  gecos     CHAR(60), # name, phone, office, etc.
  directory CHAR(60), # home directory
  shell     CHAR(60)  # command interpreter
);

To load the file, we need to specify that the column delimiter is a colon, which is easily handled with a FIELDS clause:

FIELDS TERMINATED BY ':'

However, we must also tell LOAD DATA to skip the second field that contains the password. To do this, add a column list in the statement. The list should include the name of each column to be loaded into the table, and a dummy ...

Get MySQL Cookbook, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.