Book description

In its first five years of existence, The Perl Journal ran 247 articles by over 120 authors. Every serious Perl programmer subscribed to it, and every notable Perl guru jumped at the opportunity to write for it. TPJ explained critical topics such as regular expressions, databases, and object-oriented programming, and demonstrated Perl's utility for fields as diverse as astronomy, biology, economics, AI, and games. The magazine gave birth to both the Obfuscated Perl Contest and the Perl Poetry Contest, and remains a proud and timeless achievement of Perl during one of its most exciting periods of development.

Computer Science and Perl Programming is the first volume of The Best of the Perl Journal, compiled and re-edited by the original editor and publisher of The Perl Journal, Jon Orwant. In this series, we've taken the very best (and still relevant) articles published in TPJ over its five years of publication and immortalized them in three volumes. This volume has 70 articles devoted to hard-core computer science, advanced programming techniques, and the underlying mechanics of Perl. Here's a sample of what you'll find inside:

  • Jeffrey Friedl on Understanding Regexes

  • Mark Jason Dominus on optimizing your Perl programs with Memoization

  • Damian Conway on Parsing

  • Tim Meadowcroft on integrating Perl with Microsoft Office

  • Larry Wall on the culture of Perl

Written by 41 of the most prominent and prolific members of the close-knit Perl community, this anthology does what no other book can, giving unique insight into the real-life applications and powerful techniques made possible by Perl. Other books tell you how to use Perl, but this book goes far beyond that: it shows you not only how to use Perl, but what you could use Perl for. This is more than just The Best of the Perl Journal -- in many ways, this is the best of Perl.

Table of Contents

  1. Special Upgrade Offer
  2. A Note Regarding Supplemental Files
  3. Foreword
  4. Preface
    1. Finding Perl Resources
    2. Conventions Used in This Book
    3. Comments and Questions
    4. Acknowledgments
  5. 1. Introduction
    1. History of TPJ
    2. Computer Science and Perl Programming
  6. I. Beginner Concepts
    1. 2. All About Arrays
      1. Basics
      2. Positions
      3. Position Versus Count
      4. Foreach Loops
      5. The Reverse and Sort Functions
      6. Slices
      7. Adding and Deleting Values
      8. Lists to Strings and Back Again
      9. Putting It All Together
    2. 3. Perfect Programming
      1. Warnings with -w
      2. The strict Pragma
      3. Tainting and Safe
      4. Checking Return Values
      5. Planning for Failure
      6. The Perl Debugger
      7. The Perl Profiler
      8. Stack Traces
    3. 4. Precedence
      1. What Is Precedence?
      2. Rules and More Rules
      3. An Explosion of Rules
      4. Precedence Traps and Surprises
      5. List Operators and Unary Operators
      6. Complete Rules of Precedence
      7. How to Remember All the Rules
      8. Quiz
      9. Answers
    4. 5. The Birth of a One-Liner
    5. 6. Comparators, Sorting, and Hashes
      1. Sorting
        1. The Simplest Way to Sort
        2. Tinkering with the Sort
      2. Sorting Hashes
        1. Sorting by Key
        2. Sorting a Hash by Value
        3. Sorting a Hash by Key and Value
      3. Efficient Sorting
      4. Further Reading
    6. 7. What Is Truth?
      1. The undef Function
      2. Back to Truth
      3. Truth in Context
      4. Applications
      5. Conclusion
    7. 8. Using Object-Oriented Modules
      1. Modules and Their Functional Interfaces
      2. Modules with Object-Oriented Interfaces
        1. Class Methods
        2. Object Methods
      3. What Can You Do with Objects?
      4. What’s in an Object?
      5. What Is an Object Value?
      6. So Why Do Some Modules Use Objects?
    8. 9. Unreal Numbers
      1. A Surprising Program
      2. The Right Way
    9. 10. CryptoContext
      1. Context
      2. Prototypes
      3. Subroutine Calls
      4. Putting Them All Together
      5. Conclusion
        1. Context Is Subtle
        2. Prototypes Are a Mixed Blessing
    10. 11. References
      1. Who Needs Complicated Data Structures?
      2. The Solution
      3. Making References
      4. Using References
      5. An Example
      6. Solution
      7. The Rest
      8. In Summary
    11. 12. Perl Heresies
      1. Don’t Use -w
      2. Don’t Use Regular Expressions Just Because They’re Cool
      3. Don’t Always Use Modules
      4. Partial Solutions Are Okay
  7. II. Regular Expressions
    1. 13. Understanding Regular Expressions, Part I
      1. The Story of Fred
      2. Reality Check
      3. Regular Expression Background
        1. DFA Versus NFA
        2. NFA Versus NFA, DFA Versus DFA
      4. Perl Regex Engine Basics
        1. A Sample Regex
        2. “The Longest Match Wins” and Other Myths
      5. The First Real Rule of Regexes
      6. A Single Match Attempt
        1. Multiple Paths
        2. Backtracking
      7. Options, Options, Options
        1. Alternation
        2. Character Classes
      8. How the Path Is Chosen
      9. That’s Pretty Much It
    2. 14. Understanding Regular Expressions, Part II
      1. Knowing Versus Knowing on Paper
        1. Will They Work at All?
        2. How Do They Differ?
        3. Which Is Best?
      2. Efficiency
        1. Greediness
        2. Logical “or” Versus Regex “or”
      3. Benchmarking
      4. Conclusion
    3. 15. Understanding Regular Expressions, Part III
    4. 16. Nibbling Strings
      1. The Problem
      2. Going on a Diet
    5. 17. How Regexes Work
      1. Machines
      2. Blank Arrows
      3. Rules Again
      4. How to Turn a Regex into a Penny Machine
      5. What Do You Mean, Done?
      6. The Regex Module
      7. Implications for Perl
      8. What About Backreferences?
      9. Internals of Regex.pm
      10. Lies
      11. Other Directions
      12. Bibliography
  8. III. Computer Science
    1. 18. Infinite Lists
      1. Hamming’s Problem
      2. Streams
      3. Hamming’s Problem Revisited
      4. Dataflow Programming
      5. Other Directions
      6. References
    2. 19. Compression
      1. Morse Code
      2. Ambiguous Codes
      3. Huffman Coding
      4. The Code
      5. The Rub
      6. Another Rub
      7. Other Methods
      8. Other Directions
      9. Bibliography
    3. 20. Memoization
      1. Recursive Functions
      2. The Memoize Module
      3. Module Internals
      4. Some Other Applications of Memoization
        1. Persistent Cache
        2. Profiling Execution Speed
        3. The Orcish Maneuver
        4. Dynamic Programming
      5. When Memoizing Doesn’t Work
      6. Bibliography
    4. 21. Parsing
      1. A Sample Parse
      2. Formal Grammars
      3. The Different Types of Parsers
        1. Bottom-Up Parsers
        2. Top-Down Parsers
        3. The Descent of RecDescent
      4. Building a Parser with Parse::RecDescent
      5. How Parse::RecDescent Works
        1. Handling Items
        2. Handling Repeated Items
        3. No Lexer
        4. Error Handling
      6. An In-Depth Example
        1. Freeform Grammar
      7. Advanced Features of Parse::RecDescent
        1. Automated Error Reporting
        2. Integrated Tracing Facilities
        3. Position Information Within Actions
        4. Parse Tree Pruning
        5. Deferred Actions
        6. Extensible Grammars
      8. Practical Applications of Parsing
      9. Limitations of Parse::RecDescent
        1. No Left-Recursion
        2. Coming Attractions
      10. More Information
      11. Acknowledgments
    5. 22. Trees and Game Trees
      1. What Is a Tree?
      2. Formal Definition
      3. Markup Language Trees
      4. Building Your Own Trees
      5. An Implementation: Game Trees for Alak
        1. Digression: Links to Parents
        2. Recursively Printing the Tree
        3. Growing the Tree
      6. References
    6. 23. B-Trees
      1. A Review of Binary Trees
      2. The Problem with Binary Trees
      3. B-Trees Are Always Balanced
      4. A Guided Tour of the Program
      5. Moving Down
      6. Moving Up
      7. Details
      8. Other Directions
      9. Bibliography
    7. 24. Making Life and Death Decisions with Perl
      1. Probability Theory
      2. Whoa!
      3. Perl
      4. Last Words
    8. 25. Information Retrieval
      1. Text Searches on Manual Pages
      2. The Implementation
      3. Relevance Feedback
      4. “Advanced” Search Operators
      5. Conclusion
      6. References
    9. 26. Randomness
      1. Congruential Generators
      2. Choosing the Seed
      3. LFSRs
      4. References
    10. 27. Random Number Generators and XS
      1. Random Versus Pseudorandom Numbers
      2. Linear Congruential Generators Revisited
        1. It’s Not That Bad
        2. It’s Not Good, Either
      3. A Better Generator for Perl
      4. Bridging C and Perl with XS
        1. XS Overview
        2. Types and the Typemap
      5. Acknowledgments
      6. References
  9. IV. Programming Techniques
    1. 28. Suffering from Buffering
      1. What Is Buffering?
      2. Surprise!
      3. Disabling Inappropriate Buffering
      4. Hot and Not Hot
      5. Other Perils of Buffering
        1. “My Output Is Coming Out in the Wrong Order!”
        2. “My Web Server Says I Didn’t Send the Right Headers, but I’m Sure I Did!”
        3. “I’m Trying to Send Data over the Network, but Nothing Is Sent!”
        4. “When My Program Terminates Abnormally, the Output Is Incomplete!”
      6. Flushing on Command
      7. Other Directions
      8. Summary
    2. 29. Scoping
      1. Package Variables
      2. The Current Package
      3. Package Variable Trivia
      4. Lexical Variables
      5. local and my
      6. What Good Is local?
      7. When to Use my and When to Use local
      8. Other Properties of my Variables
      9. my Variable Trivia
      10. Declarations
        1. use vars and our
      11. Summary
    3. 30. Seven Useful Uses of local
      1. 1. Special Variables
      2. 2. Localized Filehandles
        1. Localized Filehandles Revisited
        2. Marginal Uses of Localized Filehandles
        3. Dirhandles
      3. 3. The First Class Filehandle Trick
      4. 4. Aliases
      5. 5. Dynamic Scope
      6. 6. Dynamic Scope Revisited
        1. Marginal Uses of Dynamic Scoping
      7. 7. Perl 4 and Other Relics
      8. Summary
    4. 31. Parsing Command-Line Options
      1. Option Parsing Conventions
      2. The Simplest Way
      3. The Easy Way
      4. The Advanced Way
        1. Option Words
        2. Using and Bundling Single-Letter Options
        3. Advanced Destinations
        4. Other Configurations
        5. Help Messages
      5. Other Option Handling Modules
    5. 32. Building a Better Hash with tie
      1. Introduction
      2. The Problem
      3. Discussion
      4. Attempted Solutions
        1. Check for Built-In Support
        2. See If a Solution Already Exists
      5. A Working Data Structure
      6. Implementation
      7. Implementing a Tied Hash
      8. Using a Tied Hash
      9. Testing
      10. Optimizations
        1. Time
        2. Space
      11. Making It a Module
      12. Summing Up
      13. References
    6. 33. Source Filters
      1. Concepts
      2. Using Filters
      3. Writing a Source Filter
        1. Writing a Source Filter in C
        2. Creating a Source Filter as a Separate Executable
        3. Writing a Source Filter in Perl
      4. The Debug Filter
      5. Conclusion
    7. 34. Overloading
      1. Defining Your Own Types
      2. Adding Methods to the Date Class
      3. A Minor Problem
      4. Introducing Overloading
      5. Overloading More Methods
      6. Overloading and Associativity
      7. Full Overloading Implementations
      8. Automatically Generating Overloaded Methods
      9. The Fallback Mechanism
      10. Overloading and Inheritance
      11. Limitations of Operator Overloading
      12. Conclusion
      13. References
    8. 35. Building Objects Out of Arrays
      1. OO Basics
      2. Arrays Are Faster
      3. Arrays Use Less Space
      4. Arrays Can Prevent Attribute Collisions
      5. Arrays Can Prevent Misspellings
      6. Disadvantages
      7. Other Approaches
    9. 36. Hiding Objects with Closures
      1. A Simple Example
      2. Closures
      3. What About Inheritance?
      4. Conclusion
    10. 37. Multiple Dispatch in Perl
      1. Multiple Dispatch
      2. Multiple Dispatch via “Tests-in-Methods”
      3. Multiple Dispatch via a Table
        1. Initializing the Dispatch Table
        2. Choosing the Initialization Order
      4. Comparing the Two Approaches
      5. Dynamic Dispatch Tables
        1. The Costs of Extending the Dispatch Table
        2. Multiple Dispatch and Subroutine Overloading
      6. The Class::Multimethods Module
  10. V. Software Development
    1. 38. Using Other Languages from Perl
      1. Introducing Inline.pm
      2. A More Complex Example
      3. Calling C Functions from Perl
      4. Manipulating Perl’s Stack
      5. How Inline Works
      6. Creating Perl Extensions
      7. Inline::Config
      8. XS and SWIG
      9. Using Perl as C’s Memory Manager
      10. Benchmarks
    2. 39. SWIG
      1. Hooks by Hand
      2. Wrapping a C Function
      3. Interface Files
      4. An In-Depth Example: Emulating top
        1. From %{ to %}
        2. After the %{ … %} Block
      5. The top Emulator
      6. Conclusion
    3. 40. Benchmarking
      1. The Trouble with time( )
      2. Better Resolution with times
      3. The Benchmark Module
      4. Example: Summing an Array
      5. Conclusion
    4. 41. Building Software with Cons
      1. Make Doesn’t Do the Right Thing
        1. Build Sequencing
        2. Variant Builds
        3. Complexity
      2. The Solution: Cons
        1. Cons Scripts Are Perl Scripts
        2. Cons Does the Right Thing
        3. Explicit and Implicit Dependencies
        4. MD5 Cryptographic Signatures
        5. Automatic, Global Sequencing of Builds
      3. Summary
    5. 42. MakeMaker
      1. Reasons to Use MakeMaker
      2. h2xs
      3. Components of Makefile.PL
      4. A Deeper Example
      5. Advanced Makefile Features
      6. MakeMaker and Installation of Modules
      7. perllocal.pod
    6. 43. Autoloading Perl Code
      1. Why Autoload?
      2. Using the AutoLoader
      3. How Autoloading Works
      4. AutoSplitting Your Module
      5. AutoLoading Scripts
      6. AutoLoading C Programs
      7. Summary
    7. 44. Debugging and Devel::
      1. Runtime Examination of Data
      2. Profiling and Coverage Testing
      3. Reference Manipulation
      4. Helping C and C++ Programmers
      5. Rolling Your Own
        1. The DB:: Namespace
      6. Which Should You Use?
  11. VI. Networking
    1. 45. Email with Attachments
      1. What Is MIME, and Why Do I Care?
      2. How Does MIME Encode Data?
      3. Multiple Pieces of MIME
      4. How to Create a MIME Message
      5. An Alternate Route
      6. A Full-Blown Example
      7. Conclusion
    2. 46. Sending Mail Without sendmail
      1. Some Email Background
        1. A Store-and-Forward System
        2. There’s More Than One Way to Deliver Mail
        3. Standards Governing Email
      2. The Mail Itself
        1. The Message Body
        2. The Message Headers
        3. The Message Envelope
      3. Sending Mail in Six Easy Steps
        1. Step One: Connecting to the Remote SMTP Server
        2. Step Two: Identifying Yourself
        3. Step Three: Identifying the Mail Sender
        4. Step Four: Identifying the Mail Recipients
        5. Step Five: Sending the Mail
        6. Step Six: Closing the Connection
      4. What Next?
        1. Sending Mail with Mail::Mailer
        2. Sending Mail with Net::SMTP
        3. Talking Directly to the Mail Host
        4. Which Should You Choose?
    3. 47. Filtering Mail
      1. What Is It?
      2. A Very Simple Mail Filter
      3. Separating Mail into Folders
      4. Mail and News
      5. A Complete Filter
      6. Caveats
      7. Conclusion
    4. 48. Net::Telnet
      1. The Problem
      2. The Solution
      3. Telnetting the Hard Way
      4. Telnetting the Easy Way
      5. Telnetting the Easiest Way
      6. Special Considerations
      7. Other Features
    5. 49. Microsoft Office
      1. Background
      2. The Problem
      3. The Solution
      4. Wait, There’s More
    6. 50. Client-Server Applications
      1. Using the inetd Super-Daemon
      2. A Standalone Server
      3. A Threaded Server
      4. Launching Standalone Servers from inetd
      5. Further Information
    7. 51. Managing Streaming Audio
      1. Playlists, Streams, and ID3 Tags
      2. Apache::MP3
      3. Conclusion
    8. 52. A 74-Line IP Telephone
      1. Sound Cards and /dev/dsp
      2. The Simple Version
      3. Adding an MP3 Encoder
      4. Summary
      5. References
    9. 53. Controlling Modems
      1. Initializing Your Modem
      2. Getting Your Modem to Dial
      3. To Block or Not to Block?
      4. What’s Next?
      5. Afterword
      6. References
    10. 54. Using Usenet from Perl
      1. Finding Newsgroups
      2. Retrieving Articles
      3. Posting Articles
    11. 55. Transferring Files with FTP
      1. A Simple Example
      2. Multiple FTP Connections
      3. Transferring Files Between Servers
    12. 56. Spidering an FTP Site
      1. Motivation
      2. Net::FTP
      3. Downloading a File (the Simple Case)
      4. Recursion
      5. Downloading a File Tree (the Recursive Case)
      6. Uploading a File (the Simple Case)
      7. Uploading a File Tree (the Recursive Case)
      8. Applications
    13. 57. DNS Updates with Perl
      1. DNS Basics
      2. DNS Servers
      3. Dynamic Update
      4. Setting Up Your Nameserver
      5. Delegating the Zone
      6. Using Net::DNS::Update
      7. Paths for Further Exploration
  12. VII. Databases
    1. 58. DBI
      1. The Architecture of DBI
      2. Why DBI?
      3. The Modules
      4. Handles
      5. Resources
      6. Sample Code
    2. 59. Using DBI with Microsoft Access
      1. The Win32-Access-ODBC-DBI::DBD Checklist:
      2. References
    3. 60. DBI Caveats
      1. DBI and Loops
      2. Placeholders
      3. Fetches
      4. Bind Columns
      5. Error Checking
      6. Transactions
      7. References
    4. 61. Beyond Hardcoded Database Applications with DBIx::Recordset
      1. CRUD Without SQL
      2. Sample Usage
      3. A DBI Version
      4. Conclusion
    5. 62. Win32::ODBC
      1. Win32::ODBC Basics
      2. Demystifying SQL
      3. Installing Win32::ODBC
      4. Getting Started
      5. Debugging
      6. CRUD
      7. Transactions
      8. Data Sources
      9. Data Dictionary
      10. Conclusion
    6. 63. Net::LDAP
      1. What Is LDAP?
      2. Setting Up an OpenLDAP Server
      3. Loading Data into the Directory
      4. A Searchable Web Interface to Manage Your Directory
      5. Where LDAP Is Going
      6. References
    7. 64. Web Databases the Genome Project Way
      1. The ACEDB Database
      2. ACEDB Objects and Classes
      3. Accessing ACEDB from Perl
      4. ACEDB Meets the Web
      5. Registering ACEDB Displays
      6. Conclusions
      7. References
    8. 65. Spreadsheet::WriteExcel
      1. Using Spreadsheet::WriteExcel
      2. How the Spreadsheet::WriteExcel Module Works
        1. The Excel Binary Interchange File Format
        2. A Brief History of Time Wasted
        3. The pack Programming Language
        4. The Structure of the Module
      3. Alternative Ways of Writing to Excel
      4. Reading from Excel
      5. Win32::OLE
      6. Obtaining Spreadsheet::WriteExcel
      7. References
  13. VIII. Internals
    1. 66. How to Improve Perl
    2. 67. Components of the Perl Distribution
      1. The Components of Perl
        1. The Core
        2. The Standard Library
        3. Configuration and Installation
        4. Test Suite
        5. Utilities
      2. Summary
    3. 68. Basic Perl Anatomy
      1. How Perl Works
      2. Lexical Analysis
      3. Parsing
      4. Compilation
      5. Execution
      6. Perl Subsystems
      7. For Further Reading
    4. 69. Lexical Analysis
      1. Tokenizing
      2. Perl’s Lexer
      3. Lexer Variables
      4. Tokenizing Considerations
      5. Further Information
    5. 70. Debugging Perl Programs with -D
      1. What -D Does for You
      2. Trace Execution with -Dt
      3. Stack Snapshots with -Ds
      4. Syntax Tree Dump with -Dx
      5. Regular Expression Parsing and Execution with -Dr
      6. Method and Overloading Resolution with -Do
      7. Context (Loop) Stack Processing with -Dl
      8. Tokenizing and Parsing with -Dp
      9. Other -D Debugging Flags
    6. 71. Microperl
      1. Bootstrapping
      2. Building Microperl
      3. How Microperl Works
      4. Practical Uses for Microperl
      5. Problems
      6. Future Work
  14. Index
  15. Colophon
  16. Special Upgrade Offer
  17. Copyright