13.9. Reading Records with a Pattern Separator

Problem

You want to read in records from a file, in which each record is separated by a pattern you can match with a regular expression.

Solution

Read the entire file into a string and then split on the regular expression:

$filename = '/path/to/your/file.txt';
// Stop with a message if the file can't be opened
$fh = fopen($filename, 'r') or die("Can't open $filename");
$contents = fread($fh, filesize($filename));
fclose($fh);

$records = preg_split('/[0-9]+\) /', $contents);
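
As a shorter way to get the file into a string, file_get_contents() can stand in for the fopen(), fread(), and fclose() calls. The following is a minimal sketch, not the recipe's own code: the temporary file and its contents are made up here so the example runs by itself.

```php
<?php
// Hypothetical sample file, created here only so the sketch runs on its own.
$filename = tempnam(sys_get_temp_dir(), 'rec');
file_put_contents($filename, "1) Gödel\n2) Escher\n3) Bach\n");

// file_get_contents() returns the whole file as a string, or false on failure.
$contents = file_get_contents($filename);
if ($contents === false) {
    die("Can't read $filename");
}

// Split wherever one or more digits are followed by ") ".
$records = preg_split('/[0-9]+\) /', $contents);

unlink($filename); // remove the temporary sample file
```

Either way the whole file is held in memory at once, so this approach suits files that fit comfortably in RAM.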

Discussion

This breaks apart a numbered list and places the individual list items into array elements. So, if you have a list like this:

1) Gödel
2) Escher
3) Bach

You end up with a four-element array whose first element is empty. That’s because preg_split() treats the pattern as a delimiter between items, but in this case the numbers come before the items, so the empty string ahead of the first number becomes an element of its own:

Array
(
    [0] => 
    [1] => Gödel
    [2] => Escher
    [3] => Bach
)

From one point of view, this is a feature rather than a bug, because the nth element holds the nth item. To compact the array instead, remove the first element with array_shift():

$records = preg_split('/[0-9]+\) /', $contents);
array_shift($records);
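
Another way to avoid the empty leading element is to pass the PREG_SPLIT_NO_EMPTY flag to preg_split(). The sketch below assumes the sample list from this discussion, inlined as a string rather than read from a file.

```php
<?php
// Sample input matching the numbered list in the discussion (inlined, not read from a file).
$contents = "1) Gödel\n2) Escher\n3) Bach\n";

// The third argument, -1, means "no limit" on the number of pieces.
// PREG_SPLIT_NO_EMPTY discards empty pieces, including the one
// that would otherwise appear before the first number.
$records = preg_split('/[0-9]+\) /', $contents, -1, PREG_SPLIT_NO_EMPTY);

print_r($records);
```

The flag drops every empty piece, not only the first, so it is not a drop-in replacement for array_shift() if the file can contain two separators in a row and you need to keep the empty record between them.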

Another modification you might want is to strip newlines from the elements, because each record keeps the line break that followed it in the file. Replace the newlines with the empty string before splitting:

$records = preg_split('/[0-9]+\) /', str_replace("\n",'',$contents));
array_shift($records);
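
Note that str_replace() removes every "\n", including line breaks inside a multi-line record, and leaves any "\r" from Windows line endings in place. If only the whitespace around each record should go, one option is to trim each element after splitting. This is a sketch under that assumption, and the sample input is invented for illustration.

```php
<?php
// Assumed sample: Windows line endings, and a second record that spans two lines.
$contents = "1) Gödel\r\n2) Escher,\r\nM. C.\r\n3) Bach\r\n";

$records = preg_split('/[0-9]+\) /', $contents);
array_shift($records); // drop the empty leading element

// trim() strips whitespace (spaces, tabs, \r, \n) from both ends of each
// record but leaves line breaks inside a record untouched.
$records = array_map('trim', $records);
```

Here the second record keeps its internal line break, while the trailing "\r\n" on every record is gone.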

PHP doesn’t allow you to change the input record separator to anything other than a newline, so this technique is also useful for breaking apart records divided by strings. However, if you ...
