4.10. Finding the nth Instance of a Substring

Problem

Given two strings source and pattern, you want to find the nth occurrence of pattern in source.

Solution

Use the find member function to locate successive instances of the substring you are looking for. Example 4-17 contains a simple nthSubstr function.

Example 4-17. Locate the nth occurrence of a substring

#include <string>
#include <iostream>

using namespace std;

int nthSubstr(int n, const string& s,
              const string& p) {
   string::size_type i = s.find(p);     // Find the first occurrence

   int j;
   for (j = 1; j < n && i != string::npos; ++j)
      i = s.find(p, i+1); // Find the next occurrence

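   if (j == n && i != string::npos)
     return static_cast<int>(i);  // Found: convert the index explicitly
   else
     return -1;                   // n < 1, or fewer than n occurrences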
}

int main() {
   string s = "the wind, the sea, the sky, the trees";
   string p = "the";

   cout << nthSubstr(1, s, p) << '\n';  // 0
   cout << nthSubstr(2, s, p) << '\n';  // 10
   cout << nthSubstr(5, s, p) << '\n';  // -1: "the" occurs only four times
}

Discussion

There are a couple of improvements you can make to nthSubstr as it is presented in Example 4-17. First, you can make it generic over the character type by writing it as a function template instead of an ordinary function, so that it works with wstring as well as string. Second, you can add a parameter that controls whether successive matches are allowed to overlap. By “overlap,” I mean that the beginning of the pattern matches the end of the same pattern, as in the word “abracadabra,” where the last four characters are the same as the first four. Example 4-18 demonstrates this.

Example 4-18. An improved version of nthSubstr

#include <string>
#include <iostream>

using namespace std;

template<typename T>
int nthSubstrg(int n, const basic_string<T>& s,
               const basic_string<T>& p,
               bool repeats = false) {
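   // Use the size_type of the string type actually passed in, not string's
   typename basic_string<T>::size_type i = s.find(p);
   // Step past the whole match unless overlapping matches are allowed
   typename basic_string<T>::size_type adv = (repeats) ? 1 : p.length();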

   int j;
   for (j = 1; j < n && i != basic_string<T>::npos; ++j)
      i = s.find(p, i+adv);

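   if (j == n && i != basic_string<T>::npos)
     return static_cast<int>(i);  // Found: convert the index explicitly
   else
     return -1;                   // n < 1, or fewer than n occurrences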
}

int main() {
   string s = "AGATGCCATATATATACGATATCCTTA";
   string p = "ATAT";

   cout << p << " as non-repeating occurs at "
        << nthSubstrg(3, s, p) << '\n';
   cout << p << " as repeating occurs at "
        << nthSubstrg(3, s, p, true) << '\n';
}

The output for the strings in Example 4-18 is as follows:

ATAT as non-repeating occurs at 18
ATAT as repeating occurs at 11

See Also

Recipe 4.9
