2.6. The Poor Man’s Tokenizer

Problem

You need a quick method of breaking up a string into a series of discrete tokens or words.

Solution

Use the Split instance method of the string class. For example:

string equation = "1 + 2 - 4 * 5";
string[] equationTokens = equation.Split(' ');

foreach (string tok in equationTokens)
{
   Console.WriteLine(tok);
}

This code produces the following output:

1
+
2
-
4
*
5

The Split method may also be used to separate people’s first, middle, and last names. For example:

string fullName1 = "John Doe";
string fullName2 = "Doe,John";
string fullName3 = "John Q. Doe";

// The same set of delimiters is used for every name format.
char[] delimiters = {' ', ',', '.'};

string[] nameTokens1 = fullName1.Split(delimiters);
string[] nameTokens2 = fullName2.Split(delimiters);
string[] nameTokens3 = fullName3.Split(delimiters);

foreach (string tok in nameTokens1)
{
   Console.WriteLine(tok);
}
Console.WriteLine("");

foreach (string tok in nameTokens2)
{
   Console.WriteLine(tok);
}
Console.WriteLine("");

foreach (string tok in nameTokens3)
{
   Console.WriteLine(tok);
}

This code produces the following output:

John
Doe

Doe
John

John
Q

Doe

Notice that an empty string appears between Q and Doe in the output for fullName3. The '.' and the space that follows it are adjacent delimiters, so Split returns the empty string that lies between them; this is correct behavior. If you do not want to process this empty token, your code can simply skip it.
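One way to skip such empty tokens is to let Split discard them. The following is a minimal sketch, assuming .NET 2.0 or later, where the Split overload that accepts a StringSplitOptions value is available. The NameTokenizer class and its Tokenize method are hypothetical names introduced here for illustration; they are not part of the original recipe.

```csharp
using System;

public static class NameTokenizer
{
    // Hypothetical helper: splits on the same delimiters used in the recipe
    // and drops the empty strings produced by adjacent delimiters.
    public static string[] Tokenize(string text)
    {
        char[] delimiters = { ' ', ',', '.' };
        return text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Prints John, Q, and Doe on separate lines, with no empty line between Q and Doe.
        foreach (string tok in Tokenize("John Q. Doe"))
        {
            Console.WriteLine(tok);
        }
    }
}
```

On a framework version without this overload, much the same effect can be had by testing each token's Length inside the foreach loop and skipping those of length zero.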

Discussion

If you have a consistent string whose parts, or tokens, are separated by well-defined characters, the Split method can tokenize the string. Tokenizing a string consists ...
