Tokenizing

PHP provides a simple model for tokenizing a string. Characters of your choice are treated as separators, and the runs of characters between separators are the tokens. You may change the set of separators with each token you pull from a string, which is handy for irregular strings, that is, ones that aren't simply comma-separated lists.
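To see how the separator set can change from one token to the next, consider the following minimal sketch. The record format and variable names are invented for illustration. Only the first call to strtok receives the string itself; each later call passes just the separators to use for the next token.

```php
<?php
// Hypothetical record: a name, a colon, then fields split by mixed separators.
$record = "alice:red,green;blue";

$name   = strtok($record, ":"); // first call: the string plus a separator set
$first  = strtok(",");          // later calls: separators only
$second = strtok(";");          // a different separator for this token
$third  = strtok(";");          // whatever remains up to the end of the string

echo "$name | $first | $second | $third\n"; // prints "alice | red | green | blue"
```

Because strtok remembers its position inside the string between calls, each call picks up where the previous one stopped.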

Listing 16.1 accepts a sentence and breaks it into words using the strtok function, described in Chapter 9, "Data Functions." As far as the script is concerned, a word is bounded by spaces, punctuation, or either end of the sentence. Single and double quotes are left as part of the word.
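The word-splitting behavior just described can be sketched in a few lines. This is an illustrative sketch written for a recent PHP version, not the book's listing: the function name splitWords and the exact separator set are assumptions, and the loop ends when strtok returns false.

```php
<?php
// Hypothetical helper: returns the words of $sentence as an array.
function splitWords(string $sentence): array
{
    // Space and common punctuation act as separators. Quote characters are
    // not in the set, so they stay attached to the word.
    $separators = " .,;:!?";
    $words = [];

    $token = strtok($sentence, $separators);
    while ($token !== false) {
        $words[] = $token;
        $token = strtok($separators); // continue from the saved position
    }
    return $words;
}

print_r(splitWords("Hello, world! It's a \"fine\" day."));
```

For the sample sentence, the result holds six words: Hello, world, It's, a, "fine", and day. Adjacent separators, such as the comma followed by a space, do not produce empty tokens.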

Listing 16.1. Tokenizing a String

Notice the addition of <END> to the input variable. This special ...
