Epilogue
Recall the scenario with which this chapter began:
Suppose that you have a file named somedata.csv that contains 12 columns of data in comma-separated values (CSV) format. From this file you want to extract only columns 2, 11, 5, and 9, and use them to create database rows in a MySQL table that contains name, birth, height, and weight columns. You need to make sure that the height and weight are positive integers, and convert the birth dates from MM/DD/YY format to CCYY-MM-DD format. How can you do this?
So…how would you do that, based on the techniques discussed in this chapter?
Much of the work can be done using the utility programs developed here. You can convert the file to tab-delimited format with cvt_file.pl, extract the columns in the desired order with yank_col.pl, and rewrite the date column to ISO format with cvt_date.pl:
% cvt_file.pl --iformat=csv somedata.csv \
| yank_col.pl --columns=2,11,5,9 \
| cvt_date.pl --columns=2 --iformat=us --add-century > tmp
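To make the date-rewriting step concrete, here is a minimal sketch of the kind of conversion involved. It is not the code of cvt_date.pl; the function name us_to_iso() is hypothetical, and the century pivot (00-69 become 20xx, 70-99 become 19xx) is an assumption chosen to mirror the rule MySQL itself applies to two-digit years. The sketch checks only the shape of the input, not whether the month and day are in range.

```perl
#!/usr/bin/perl
# us_to_iso.pl - hypothetical sketch of MM/DD/YY -> CCYY-MM-DD conversion
# (illustration only; not the actual cvt_date.pl implementation)

use strict;
use warnings;

# Convert a US-format date to ISO format. Two-digit years are expanded
# using an assumed pivot: 00-69 become 20xx, 70-99 become 19xx.
# Returns undef if the input does not look like a US-format date.
# Month and day ranges are NOT checked here.
sub us_to_iso
{
  my $date = shift;

  return undef unless $date =~ m{^(\d{1,2})/(\d{1,2})/(\d{2}|\d{4})$};
  my ($month, $day, $year) = ($1, $2, $3);
  if (length ($year) == 2)
  {
    $year += ($year >= 70 ? 1900 : 2000);
  }
  return sprintf ("%04d-%02d-%02d", $year, $month, $day);
}

print us_to_iso ("12/31/73"), "\n";   # prints 1973-12-31
print us_to_iso ("7/4/05"), "\n";     # prints 2005-07-04
```

Whatever pivot a real conversion uses, it is worth confirming that it suits the data: birth dates in an older data set may need a different cutoff than the one assumed here.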
The resulting file, tmp, will have four columns representing the name, birth, height, and weight values, in that order. It needs only to have its height and weight columns checked to make sure they contain positive integers. Using the is_positive_integer() library function from the Cookbook_Utils.pm module file, that task can be achieved using a short special-purpose script that isn’t much more than an input loop:
#!/usr/bin/perl
# validate_htwt.pl - height/weight validation example
# Assumes tab-delimited, linefeed-terminated ...
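As a rough illustration of what such a height and weight check might look like, the following sketch validates a few tab-delimited lines. It is not the validate_htwt.pl script itself. Because Cookbook_Utils.pm is not shown here, the sketch defines its own stand-in, looks_like_positive_integer(), whose behavior (digits with an optional leading plus sign and a nonzero value) is an assumption about what is_positive_integer() accepts. The check_line() helper and the sample records are likewise made up for the example.

```perl
#!/usr/bin/perl
# check_htwt_sketch.pl - hypothetical height/weight check (illustration only)

use strict;
use warnings;

# Local stand-in for the library's is_positive_integer(). Assumed behavior:
# true for a string of digits, optionally preceded by "+", with nonzero value.
sub looks_like_positive_integer
{
  my $val = shift;
  return (defined ($val) && $val =~ /^\+?\d+$/ && $val > 0) ? 1 : 0;
}

# Check one tab-delimited line (name, birth, height, weight) and
# return a list of problem descriptions, empty if the line is okay.
sub check_line
{
  my ($line, $line_num) = @_;
  my @problems;

  chomp ($line);
  my ($name, $birth, $height, $weight) = split (/\t/, $line, -1);
  push (@problems, "line $line_num: height is not a positive integer")
    unless looks_like_positive_integer ($height);
  push (@problems, "line $line_num: weight is not a positive integer")
    unless looks_like_positive_integer ($weight);
  return @problems;
}

# Made-up sample records standing in for the contents of tmp
my @sample = (
  "Alice\t1973-12-31\t64\t120\n",
  "Bob\t1980-02-29\t-70\t180\n",
  "Carol\t1991-07-04\t66\t0\n",
);

my $line_num = 0;
foreach my $line (@sample)
{
  print "$_\n" foreach check_line ($line, ++$line_num);
}
```

To check a real file, the loop over @sample could be replaced with a while loop that reads lines from standard input or from files named on the command line, using $. as the line number.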