Printing Lines Containing a Pattern

Problem

You need to look for lines matching a given RE in one or more files.

Solution

As I’ve mentioned, once you have an RE package, you can write a grep-like program. I gave an example of the Unix grep program earlier. grep is called with some optional arguments, followed by one required regular expression pattern, followed by an arbitrary number of filenames. It prints every line that contains a match for the pattern, unlike the program in Section 4.8, which prints only the matching text itself. For example:

grep "[dD]arwin" *.txt

searches for lines containing either “darwin” or “Darwin” in any file whose name ends in “.txt”.[18] Example 4-1 is the source for the first version of a program to do this, called Grep1. It doesn’t yet take any optional arguments, but it handles the full set of regular expressions that the RE class implements. We haven’t covered the java.io package for input and output yet (see Chapter 9), but our use of it here is simple enough that you can probably intuit it. Later in this chapter, Section 4.14 presents a Grep2 program that uses my GetOpt (see Section 2.8) to parse command-line options.

import org.apache.regexp.*;
import java.io.*;

/** A command-line grep-like program. No options, but takes a pattern
 * and an arbitrary list of text files.
 */
public class Grep1 {
    /** The pattern we're looking for */
    protected RE pattern;
    /** The Reader for the current file */
    protected BufferedReader d;

    /** Construct a Grep1 object for the pattern, and run it
     * on all input files listed in argv (or on standard input
     * if no files are given).
     */
    public static void main(String[] argv) throws Exception {

        if (argv.length < 1) {
            System.err.println("Usage: Grep1 pattern [filename ...]");
            System.exit(1);
        }

        Grep1 pg = new Grep1(argv[0]);

        if (argv.length == 1)
            pg.process(new InputStreamReader(System.in),
                "(standard input)", false);
        else
            for (int i=1; i<argv.length; i++) {
                pg.process(new FileReader(argv[i]), argv[i], true);
            }
    }

    public Grep1(String arg) throws RESyntaxException {
        // compile the regular expression
        pattern = new RE(arg);
    }
        
    /** Do the work of scanning one file
     * @param ifile Reader object, already open
     * @param fileName Name of the input file, used only for display
     * @param printFileName true to print the filename
     * before each line that matches.
     */
    public void process(
        Reader ifile, String fileName, boolean printFileName) {

        String line;

        try {
            d = new BufferedReader(ifile);

            // Print each line in which the pattern matches anywhere
            while ((line = d.readLine()) != null) {
                if (pattern.match(line)) {
                    if (printFileName)
                        System.out.print(fileName + ": ");
                    System.out.println(line);
                }
            }
            d.close();
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}
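
If you don't have the org.apache.regexp package on hand, the same scanning loop can be sketched with the standard java.util.regex classes. This is only an illustration, not part of Grep1: the class name GrepSketch and its two methods are hypothetical names I made up for the example, and I am assuming that Matcher.find() is the closest counterpart to RE.match(), since both look for the pattern anywhere in the line. The second method returns only the matched text, to show the contrast with printing whole lines.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: the Grep1 scanning loop built on java.util.regex. */
public class GrepSketch {

    /** Return every line of the input that contains a match for the pattern. */
    public static List<String> matchingLines(Pattern pattern, Reader in) {
        List<String> hits = new ArrayList<>();
        // lines() reports read errors as UncheckedIOException
        new BufferedReader(in).lines()
            // find() succeeds on a match anywhere in the line;
            // matches() would require the whole line to match
            .filter(line -> pattern.matcher(line).find())
            .forEach(hits::add);
        return hits;
    }

    /** Return only the matched text itself, for contrast. */
    public static List<String> matchedText(Pattern pattern, Reader in) {
        List<String> hits = new ArrayList<>();
        new BufferedReader(in).lines().forEach(line -> {
            Matcher m = pattern.matcher(line);
            while (m.find()) {
                hits.add(m.group());
            }
        });
        return hits;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("[dD]arwin");
        String text = "Charles Darwin sailed on the Beagle.\n"
                    + "The finches came later.\n"
                    + "The port of darwin was named for him.\n";

        // grep-style: whole lines that contain a match
        matchingLines(p, new StringReader(text)).forEach(System.out::println);

        // matched text only
        matchedText(p, new StringReader(text)).forEach(System.out::println);
    }
}
```

Run on the sample text, the first call prints the first and third lines in full, and the second prints just "Darwin" and "darwin". Returning a List instead of printing directly is a design choice made only to keep the sketch easy to test; Grep1 itself prints as it goes.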


[18] On Unix, the shell or command-line interpreter expands *.txt into the list of matching filenames before the program ever runs; on systems where the shell isn’t energetic or bright enough to do it, the normal Java interpreter does this for you.
