O'Reilly logo

Python 2.6 Text Processing Beginner's Guide by Jeff McNeil

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Time for action - regular expression grouping

In this example, we'll return to our LogProcessing application. Here, we'll update our log split routines to divide lines up via a regular expression as opposed to simple string manipulation.

  1. In core.py, add an import re statement to the top of the file. This makes the regular expression engine available to us.
  2. Directly above the __init__ method definition for LogProcessor, add the following lines of code. These have been split to avoid wrapping.
    _re = re.compile(
    			r'^([\d.]+) (\S+) (\S+) \[([\w/:+ ]+)] "(.+?)" ' \
    			r'(?P<rcode>\d{3}) (\S+) "(\S+)" "(.+)"')
    
  3. Now, we're going to replace the split method with one that takes advantage of the new regular expression:
     def split(self, line): """ Split a logfile. ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required