flex & bison

Book description

If you need to parse or process text data in Linux or Unix, this useful book explains how to use flex and bison to solve your problems quickly. flex & bison is the long-awaited sequel to the classic O'Reilly book, lex & yacc. In the nearly two decades since the original book was published, the flex and bison utilities have proven to be more reliable and more powerful than the original Unix tools.

flex & bison covers the same core functionality vital to Linux and Unix program development, along with several important new topics. You'll find revised tutorials for novices and references for advanced users, as well as an explanation of each utility's basic usage and simple, standalone applications you can create with them. With flex & bison, you'll discover the wide range of uses these flexible tools offer.

  • Address syntax crunching that regular expressions tools can't handle
  • Build compilers and interpreters, and handle a wide range of text processing functions
  • Interpret code, configuration files, or any other structured format
  • Learn key programming techniques, including abstract syntax trees and symbol tables
  • Implement a full SQL grammar-with complete sample code
  • Use new features such as pure (reentrant) lexers and parsers, powerful GLR parsers, and interfaces to C++

Publisher resources

View/Submit Errata

Table of contents

  1. flex & bison
  2. Preface
    1. Scope of This Book
    2. Conventions Used in This Book
    3. Getting Flex and Bison
    4. This Book’s Example Files
    5. Using Code Examples
    6. Safari® Books Online
    7. How to Contact Us
    8. Acknowledgments
  3. 1. Introducing Flex and Bison
    1. Lexical Analysis and Parsing
    2. Regular Expressions and Scanning
      1. Our First Flex Program
      2. Programs in Plain Flex
      3. Putting Flex and Bison Together
      4. The Scanner as Coroutine
      5. Tokens and Values
    3. Grammars and Parsing
      1. BNF Grammars
      2. Bison’s Rule Input Language
      3. Compiling Flex and Bison Programs Together
    4. Ambiguous Grammars: Not Quite
    5. Adding a Few More Rules
    6. Flex and Bison vs. Handwritten Scanners and Parsers
    7. Exercises
  4. 2. Using Flex
    1. Regular Expressions
      1. Regular Expression Examples
      2. How Flex Handles Ambiguous Patterns
      3. Context-Dependent Tokens
    2. File I/O in Flex Scanners
    3. Reading Several Files
    4. The I/O Structure of a Flex Scanner
      1. Input to a Flex Scanner
      2. Flex Scanner Output
    5. Start States and Nested Input Files
    6. Symbol Tables and a Concordance Generator
      1. Managing Symbol Tables
      2. Using a Symbol Table
    7. C Language Cross-Reference
    8. Exercises
  5. 3. Using Bison
    1. How a Bison Parser Matches Its Input
    2. Shift/Reduce Parsing
      1. What Bison’s LALR(1) Parser Cannot Parse
    3. A Bison Parser
    4. Abstract Syntax Trees
    5. An Improved Calculator That Creates ASTs
      1. Literal Character Tokens
      2. Building the AST Calculator
    6. Shift/Reduce Conflicts and Operator Precedence
      1. When Not to Use Precedence Rules
    7. An Advanced Calculator
      1. Advanced Calculator Parser
      2. Calculator Statement Syntax
      3. Calculator Expression Syntax
      4. Top-Level Calculator Grammar
      5. Basic Parser Error Recovery
      6. The Advanced Calculator Lexer
      7. Reserved Words
      8. Building and Interpreting ASTs
      9. Evaluating Functions in the Calculator
      10. User-Defined Functions
    8. Using the Advanced Calculator
    9. Exercises
  6. 4. Parsing SQL
    1. A Quick Overview of SQL
      1. Relational Databases
    2. Manipulating Relations
    3. Three Ways to Use SQL
    4. SQL to RPN
    5. The Lexer
      1. Scanning SQL Keywords
      2. Scanning Numbers
      3. Scanning Operators and Punctuation
      4. Scanning Functions and Names
      5. Comments and Miscellany
    6. The Parser
      1. The Top-Level Parsing Rules
      2. SQL Expressions
        1. Functions
        2. Other expressions
      3. Select Statements
        1. Select options and table references
        2. SELECT table references
      4. Delete Statement
      5. Insert and Replace Statements
        1. Replace statement
      6. Update Statement
      7. Create Database
      8. Create Table
      9. User Variables
      10. The Parser Routines
    7. The Makefile for the SQL Parser
    8. Exercises
  7. 5. A Reference for Flex Specifications
    1. Structure of a Flex Specification
      1. Definition Section
      2. Rules Section
      3. User Subroutines
    2. BEGIN
    3. C++ Scanners
    4. Context Sensitivity
      1. Left Context
      2. Right Context
    5. Definitions (Substitutions)
    6. ECHO
    7. Input Management
      1. Stdio File Chaining
      2. Input Buffers
      3. Input from Strings
      4. File Nesting
      5. input()
      6. YY_INPUT
    8. Flex Library
    9. Interactive and Batch Scanners
    10. Line Numbers and yylineno
    11. Literal Block
    12. Multiple Lexers in One Program
      1. Combined Lexers
      2. Multiple Lexers
    13. Options When Building a Scanner
    14. Portability of Flex Lexers
      1. Porting Generated C Lexers
        1. Buffer sizes
        2. Character sets
    15. Reentrant Scanners
      1. Extra Data for Reentrant Scanners
      2. Access to Reentrant Scanner Data
      3. Reentrant Scanners, Nested Files, and Multiple Scanners
      4. Using Reentrant Scanners with Bison
    16. Regular Expression Syntax
      1. Metacharacters
    17. REJECT
    18. Returning Values from yylex()
    19. Start States
    20. unput()
    21. yyinput() yyunput()
    22. yyleng
    23. yyless()
    24. yylex() and YY_DECL
    25. yymore()
    26. yyrestart()
    27. yy_scan_string and yy_scan_buffer
    28. YY_USER_ACTION
    29. yywrap()
  8. 6. A Reference for Bison Specifications
    1. Structure of a Bison Grammar
      1. Symbols
      2. Definition Section
      3. Rules Section
      4. User Subroutines Section
    2. Actions
      1. Embedded Actions
      2. Symbol Types for Embedded Actions
    3. Ambiguity and Conflicts
      1. Types of Conflicts
      2. Shift/Reduce Conflicts
      3. Reduce/Reduce Conflicts
      4. %expect
      5. GLR Parsers
    4. Bugs in Bison Programs
      1. Infinite Recursion
      2. Interchanging Precedence
      3. Embedded Actions
    5. C++ Parsers
    6. %code Blocks
    7. End Marker
    8. Error Token and Error Recovery
      1. %destructor
    9. Inherited Attributes ($0)
      1. Symbol Types for Inherited Attributes
    10. %initial-action
    11. Lexical Feedback
    12. Literal Block
    13. Literal Tokens
    14. Locations
    15. %parse-param
    16. Portability of Bison Parsers
      1. Porting Bison Grammars
      2. Porting Generated C Parsers
      3. Libraries
      4. Character Codes
    17. Precedence and Associativity Declarations
      1. Precedence
      2. Associativity
      3. Precedence Declarations
      4. Using Precedence and Associativity to Resolve Conflicts
      5. Typical Uses of Precedence
    18. Recursive Rules
      1. Left and Right Recursion
    19. Rules
    20. Special Characters
    21. %start Declaration
    22. Symbol Values
      1. Declaring Symbol Types
      2. Explicit Symbol Types
    23. Tokens
      1. Token Numbers
      2. Token Values
      3. %type Declaration
      4. %union Declaration
    24. Variant and Multiple Grammars
      1. Combined Parsers
    25. Multiple Parsers
      1. Using %name-prefix or the -p Flag
      2. Lexers for Multiple Parsers
      3. Pure Parsers
    26. y.output Files
    27. Bison Library
      1. main()
      2. yyerror()
    28. YYABORT
    29. YYACCEPT
    30. YYBACKUP
    31. yyclearin
    32. yydebug and YYDEBUG
      1. YYDEBUG
      2. yydebug
    33. yyerrok
    34. YYERROR
    35. yyerror()
    36. yyparse()
    37. YYRECOVERING()
  9. 7. Ambiguities and Conflicts
    1. The Pointer Model and Conflicts
    2. Kinds of Conflicts
    3. Parser States
    4. Contents of name.output
    5. Reduce/Reduce Conflicts
    6. Shift/Reduce Conflicts
    7. Review of Conflicts in name.output
    8. Common Examples of Conflicts
      1. Expression Grammars
      2. IF/THEN/ELSE
      3. Nested List Grammar
    9. How Do You Fix the Conflict?
      1. IF/THEN/ELSE (Shift/Reduce)
      2. Loop Within a Loop (Shift/Reduce)
      3. Expression Precedence (Shift/Reduce)
      4. Limited Lookahead (Shift/Reduce or Reduce/Reduce)
      5. Overlap of Alternatives (Reduce/Reduce)
    10. Summary
    11. Exercises
  10. 8. Error Reporting and Recovery
    1. Error Reporting
    2. Locations
      1. Adding Locations to the Parser
      2. Adding Locations to the Lexer
      3. More Sophisticated Locations with Filenames
    3. Error Recovery
    4. Bison Error Recovery
      1. Freeing Discarded Symbols
      2. Error Recovery in Interactive Parsers
      3. Where to Put Error Tokens
    5. Compiler Error Recovery
    6. Exercises
  11. 9. Advanced Flex and Bison
    1. Pure Scanners and Parsers
      1. Pure Scanners in Flex
      2. Pure Parsers in Bison
      3. Using Pure Scanners and Parsers Together
      4. A Reentrant Calculator
    2. GLR Parsing
      1. GLR Version of the SQL Parser
    3. C++ Parsers
      1. A C++ Calculator
      2. C++ Parser Naming
      3. A C++ Parser
      4. Interfacing a Scanner with a C++ Parser
      5. Should You Write Your Parser in C++ ?
    4. Exercises
  12. A. SQL Parser Grammar and Cross-Reference
  13. Glossary
  14. Index
  15. About the Author
  16. Colophon
  17. Copyright

Product information

  • Title: flex & bison
  • Author(s): John Levine
  • Release date: August 2009
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781449379278