Validating and Transforming Data
Problem
You need to make sure that the data values contained in a file are legal.
Solution
Check them, possibly rewriting them into a more suitable format.
Discussion
Earlier recipes in this chapter show how to work with the structural characteristics of files, by reading lines and breaking them up into separate columns. It’s important to be able to do that, but sometimes you need to work with the data content of a file, not just its structure:
It’s often a good idea to validate data values to make sure they’re legal for the data types into which you’re storing them. For example, you can make sure that values intended for
INT
,DATE
, andENUM
columns are integers, dates inCCYY-MM-DD
format, and legal enumeration values, respectively.Data values may need reformatting. Rewriting dates from one format to another is especially common; for example, if a program writes dates in
MM-DD-YY
format to ISO format for import into MySQL. If a program understands only date and time formats and not a combined date-and-time format (such as MySQL uses for theDATETIME
andTIMESTAMP
data types), you need to split date-and-time values into separate date and time values.It may be necessary to recognize special values in the file. It’s common to represent
NULL
with a value that does not otherwise occur in the file, such as-1
,Unknown
, orN/A
. If you don’t want those values to be imported literally, you need to recognize and handle them specially.
This is the first of a set ...
Get MySQL Cookbook, 2nd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.