2.6. The Poor Man’s Tokenizer
Problem
You need a quick method of breaking up a string into a series of discrete tokens or words.
Solution
Use the Split instance method of the string class. For example:
string equation = "1 + 2 - 4 * 5";
string[] equationTokens = equation.Split(new char[1]{' '});

foreach (string tok in equationTokens)
{
    Console.WriteLine(tok);
}
This code produces the following output:
1
+
2
-
4
*
5
The Split method may also be used to separate people’s first, middle, and last names. For example:
string fullName1 = "John Doe";
string fullName2 = "Doe,John";
string fullName3 = "John Q. Doe";

string[] nameTokens1 = fullName1.Split(new char[3]{' ', ',', '.'});
string[] nameTokens2 = fullName2.Split(new char[3]{' ', ',', '.'});
string[] nameTokens3 = fullName3.Split(new char[3]{' ', ',', '.'});

foreach (string tok in nameTokens1)
{
    Console.WriteLine(tok);
}
Console.WriteLine("");

foreach (string tok in nameTokens2)
{
    Console.WriteLine(tok);
}
Console.WriteLine("");

foreach (string tok in nameTokens3)
{
    Console.WriteLine(tok);
}
This code produces the following output:
John
Doe

Doe
John

John
Q

Doe
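Notice that a blank line appears between Q and Doe in the output for fullName3. The '.' and the space that follows it are adjacent delimiters, so Split returns the empty string that lies between them as a token; this is correct behavior. If you do not want to process this empty token in your code, you can choose to ignore it.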
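One way to skip the empty token, rather than filtering it by hand, is the Split overload that accepts a StringSplitOptions value (available in .NET Framework 2.0 and later). The following is a minimal sketch, not part of the original recipe; the NameTokenizer class and its Tokenize method are hypothetical names chosen for illustration.

```csharp
using System;

public static class NameTokenizer
{
    // The same delimiters used in the name examples above.
    private static readonly char[] Delimiters = { ' ', ',', '.' };

    // Hypothetical helper: splits a name and discards zero-length tokens,
    // such as the one between the '.' and the space in "John Q. Doe".
    public static string[] Tokenize(string fullName)
    {
        return fullName.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        foreach (string tok in Tokenize("John Q. Doe"))
        {
            Console.WriteLine(tok);
        }
    }
}
```

With RemoveEmptyEntries, the three tokens John, Q, and Doe are returned with no empty string between them. If the empty tokens carry meaning in your data (for example, a missing field), keep the default behavior instead.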
Discussion
If you have a consistent string whose parts, or tokens, are separated by well-defined characters, the Split method can tokenize the string. Tokenizing a string consists ...
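As an illustration of a consistent string with well-defined separators, the sketch below tokenizes a list of key=value pairs in two passes: first on ';' and then on '='. The input format, the SettingsTokenizer class, and its Parse method are assumptions made for this example; they do not come from the recipe.

```csharp
using System;
using System.Collections.Generic;

public static class SettingsTokenizer
{
    // Hypothetical helper: parses "key=value;key=value" into a dictionary.
    public static Dictionary<string, string> Parse(string settings)
    {
        var result = new Dictionary<string, string>();

        // First pass: one token per pair, ignoring empty entries
        // (for example, from a trailing ';').
        string[] pairs = settings.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string pair in pairs)
        {
            // Second pass: limit the split to two parts so that a value
            // may itself contain '='.
            string[] parts = pair.Split(new[] { '=' }, 2);
            if (parts.Length == 2)
            {
                result[parts[0].Trim()] = parts[1].Trim();
            }
        }

        return result;
    }

    public static void Main()
    {
        foreach (KeyValuePair<string, string> entry in Parse("host=localhost;port=8080;mode=debug"))
        {
            Console.WriteLine(entry.Key + " -> " + entry.Value);
        }
    }
}
```

Pairs without an '=' are silently skipped here; whether that is acceptable or should raise an error depends on how strict the input format is meant to be.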