Text Processing with Ruby

Book Description

Text is everywhere. Web pages, databases, the contents of files--for almost any programming task you perform, you need to process text. Cut even the most complex text-based tasks down to size and learn how to master regular expressions, scrape information from Web pages, develop reusable utilities to process text in pipelines, and more.

Table of Contents

  1. Text Processing with Ruby
    1. For the Best Reading Experience...
    2. Table of Contents
    3. Early praise for Text Processing with Ruby
    4. Acknowledgments
    5. Introduction
      1. About This Book
      2. Online Resources
    6. Part 1: Extract: Acquiring Text
      1. Chapter 1: Reading from Files
        1. Opening a File
        2. Reading from a File
        3. Treating Files as Streams
        4. Reading Fixed-Width Files
        5. Wrapping Up
      2. Chapter 2: Processing Standard Input
        1. Redirecting Input from Other Processes
        2. Example: Extracting URLs
        3. Concurrency and Buffering
        4. Wrapping Up
      3. Chapter 3: Shell One-Liners
        1. Arguments to the Ruby Interpreter
        2. Prepending and Appending Code
        3. Example: Parsing Log Files
        4. Wrapping Up
      4. Chapter 4: Flexible Filters with ARGF
        1. Reading from ARGF as a Stream
        2. Modifying Files
        3. Manipulating ARGV
        4. Wrapping Up
      5. Chapter 5: Delimited Data
        1. Parsing a TSV
        2. Delimited Data and the Command Line
        3. The CSV Format
        4. Wrapping Up
      6. Chapter 6: Scraping HTML
        1. The Right Tool for the Job: Nokogiri
        2. Searching the Document
        3. Working with Elements
        4. Exploring a Page
        5. Example: Reading a League Table
        6. Wrapping Up
      7. Chapter 7: Encodings
        1. A Brief Introduction to Character Encodings
        2. Ruby’s Support for Character Encodings
        3. Detecting Encodings
        4. Wrapping Up
    7. Part 2: Transform: Modifying and Manipulating Text
      1. Chapter 8: Regular Expressions Basics
        1. A Gentle Introduction
        2. Pattern Syntax
        3. Regular Expressions in Ruby
        4. Wrapping Up
      2. Chapter 9: Extraction and Substitution with Regular Expressions
        1. Matching Against Patterns
        2. Global Match Variables
        3. Extracting Multiple Matches
        4. Transforming Text
        5. Wrapping Up
      3. Chapter 10: Writing Parsers
        1. Simple Parsers with StringScanner
        2. Example: Parsing a Config File
        3. Rule-Based Parsers
        4. Example: Parsing RTF Files
        5. Wrapping Up
      4. Chapter 11: Natural Language Processing
        1. What Is Natural Language Processing?
        2. Example: Extracting Keywords from Articles
        3. Example: Fuzzy Searching
        4. Wrapping Up
    8. Part 3: Load: Writing Text
      1. Chapter 12: Standard Output and Standard Error
        1. Simple Output
        2. Formatting Output with printf
        3. Redirecting Standard Output
        4. Wrapping Up
      2. Chapter 13: Writing to Other Processes and to Files
        1. Writing to Other Processes
        2. Writing to Files
        3. Temporary Files
        4. Wrapping Up
      3. Chapter 14: Serialization and Structure: JSON, XML, CSV
        1. JSON
        2. XML
        3. CSV
        4. Wrapping Up
      4. Chapter 15: Templating Output with ERB
        1. Writing Templates
        2. Example: Generating a Purchase Ledger
        3. Evaluating Templates
        4. Passing Data to Templates
        5. Controlling Presentation with Decorators
        6. Wrapping Up
    9. Part 4: Appendices
      1. Appendix 1: A Shell Primer
        1. Running Commands
        2. Controlling Output
        3. Exit Statuses and Flow Control
      2. Appendix 2: Useful Shell Commands
    10. You May Be Interested In…