Appendix D. The Essential Guide to Perl Data Munging

Oracle DBAs spend a great deal of time handling data that for one reason or another needs to be cleaned, transformed, and/or formatted. They need to fill Oracle data warehouses with customer data from multiple sources, import data into Oracle databases from non-Oracle data streams, and convert and format source material of all kinds. Whether it’s an XML stream from a web page, a SQL*Loader feed from a telecom switch, or a snapshot transfer from another database, DBAs must ensure that these data transfers are clean, accurate, and timely. Unfortunately, the raw data they’re given to work with is often dirty, inaccurate, behind schedule, and unfit for SQL*Loader. This is a job for Perl and its wonderful world of data munging!

Data munging, the process of transforming data as it moves from one place to another, is increasingly important for Oracle DBAs to understand, and it is an operation at which Perl excels. Because Perl DBI can work with multiple database types simultaneously, transferring data from one database to another can be as simple as lining up dominoes!

This appendix presents the basics of data munging and illustrates a typical data-munging operation: importing a MySQL data stream into an Oracle database and transforming it as necessary. We’ll also describe the many Perl data-munging modules that you can download from CPAN and use in conjunction with Oracle databases. ...
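
To make the MySQL-to-Oracle idea concrete, here is a minimal sketch of such a transfer. It is not the appendix's own example: the customers table, its three columns, and the cleanup rules in munge_row are all hypothetical, chosen only to show the shape of the operation. The copy_rows routine expects two DBI handles that are already connected, so the script itself runs without a database and simply demonstrates the cleanup step on one sample row.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical cleanup rules for one row: (name, signup_date, phone).
sub munge_row {
    my ($name, $date, $phone) = @_;

    # Trim the name and collapse runs of whitespace inside it.
    $name = '' unless defined $name;
    $name =~ s/^\s+|\s+$//g;
    $name =~ s/\s+/ /g;

    # MySQL's zero date has no Oracle equivalent; load it as NULL.
    $date = undef if defined $date && $date eq '0000-00-00';

    # Keep digits only; treat an empty result as NULL.
    if (defined $phone) {
        $phone =~ s/\D//g;
        $phone = undef if $phone eq '';
    }

    return ($name, $date, $phone);
}

# Copy rows from a source handle to a destination handle, munging
# each row on the way. Both arguments are connected DBI handles;
# the table and column names are assumptions for illustration.
sub copy_rows {
    my ($src, $dst) = @_;

    my $sel = $src->prepare(
        'SELECT name, signup_date, phone FROM customers');
    my $ins = $dst->prepare(
        q{INSERT INTO customers (name, signup_date, phone)
          VALUES (?, TO_DATE(?, 'YYYY-MM-DD'), ?)});

    $sel->execute;
    my $count = 0;
    while (my @row = $sel->fetchrow_array) {
        $ins->execute(munge_row(@row));
        $count++;
    }
    $dst->commit;    # assumes the handle was opened with AutoCommit => 0
    return $count;
}

# Demonstrate the cleanup step alone; no database is needed for this.
my @clean = munge_row("  Ada   Lovelace ", '0000-00-00', '(555) 010-9999');
print join('|', map { defined $_ ? $_ : 'NULL' } @clean), "\n";
# prints: Ada Lovelace|NULL|5550109999
```

With DBI and the relevant drivers installed, the two handles might be opened with calls such as DBI->connect('dbi:mysql:database=shop;host=localhost', $user, $pass, { RaiseError => 1 }) for the source and DBI->connect('dbi:Oracle:ORCL', $user, $pass, { RaiseError => 1, AutoCommit => 0 }) for the destination, then passed to copy_rows. The database name shop and the Oracle SID ORCL are placeholders. Keeping the cleanup logic in its own subroutine means it can be tested without touching either database.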
