Reading Files from Different Operating Systems

Problem

Different operating systems use different line-ending sequences.

Solution

That’s why LOAD DATA has a LINES TERMINATED BY clause. Use it to name the line-ending sequence that the datafile actually contains.

Discussion

The line-ending sequence used in a datafile typically is determined by the system on which the file originates, not the system on which you import it. Keep this in mind when loading a file that is obtained from a different system.
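If you don't know where a file came from, you can inspect its bytes before loading it. The following Python sketch is one way to do that; it is not part of MySQL. The function name detect_line_ending and the sample_size parameter are made up for illustration. It counts each kind of line ending in the first part of the file and reports the most common one.

```python
def detect_line_ending(path, sample_size=65536):
    """Guess the line terminator of a file from its leading bytes.

    Returns "\r\n", "\n", or "\r", or None if the sample contains no
    line terminator. Hypothetical helper; the name and the default
    sample size are illustrative assumptions.
    """
    with open(path, "rb") as f:
        sample = f.read(sample_size)
        # Avoid splitting a CRLF pair at the sample boundary.
        if len(sample) == sample_size and sample.endswith(b"\r"):
            sample += f.read(1)

    crlf = sample.count(b"\r\n")
    # Subtract the CRLF pairs so only bare LF and bare CR are counted.
    lf = sample.count(b"\n") - crlf
    cr = sample.count(b"\r") - crlf

    counts = {"\r\n": crlf, "\n": lf, "\r": cr}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None
```

A file that mixes line endings gets whichever sequence occurs most often, so treat the result as a hint and check the file by hand if the import looks wrong.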

Unix files normally have lines terminated by linefeeds, which you can indicate in a LOAD DATA statement like this:

LINES TERMINATED BY '\n'

However, because \n is the default line terminator for LOAD DATA, you don’t need to specify a LINES TERMINATED BY clause in this case unless you want to indicate explicitly what the line-ending sequence is.

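Files created under classic Mac OS (before Mac OS X) usually have lines ending in carriage returns, and files created under Windows usually have lines ending in carriage return/linefeed pairs. (Mac OS X and later versions of macOS use linefeeds, as Unix does.) To handle these different kinds of line endings, use the appropriate LINES TERMINATED BY clause: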

LINES TERMINATED BY '\r'
LINES TERMINATED BY '\r\n'
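A program that generates LOAD DATA statements must write the escape sequence (a backslash followed by a letter) into the SQL text, not the raw control character. The Python sketch below shows one way to map a terminator to its clause. The function name lines_terminated_by_clause is hypothetical, and it handles only the three terminators discussed here.

```python
def lines_terminated_by_clause(terminator):
    """Return the LINES TERMINATED BY clause for a line terminator.

    `terminator` is the actual character sequence: "\n", "\r", or
    "\r\n". The returned SQL text contains the backslash escapes, as
    in the clauses shown above. Hypothetical helper for illustration.
    """
    # Raw strings keep the backslashes literal in the generated SQL.
    escapes = {
        "\n": r"\n",      # Unix (and the LOAD DATA default)
        "\r": r"\r",      # bare carriage return
        "\r\n": r"\r\n",  # Windows CRLF pair
    }
    if terminator not in escapes:
        raise ValueError("unsupported line terminator: %r" % (terminator,))
    return "LINES TERMINATED BY '%s'" % escapes[terminator]


if __name__ == "__main__":
    print(lines_terminated_by_clause("\r\n"))
```

Rejecting anything outside the three known terminators keeps arbitrary text from being spliced into a SQL statement.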

For example, to load a Windows file that contains tab-delimited fields and lines ending with CRLF pairs, use this LOAD DATA statement:

mysql> LOAD DATA LOCAL INFILE 'mytbl.txt' INTO TABLE mytbl
    -> LINES TERMINATED BY '\r\n';

The corresponding mysqlimport command is:

% mysqlimport --local --lines-terminated-by="\r\n" cookbook mytbl.txt
