How to Split a String in C++?


Splitting a string means dividing it into smaller pieces (tokens), typically individual words. The split points are determined by a delimiter such as a space ( ), comma (,), or hyphen (-).

Many programming languages include a split() function for dividing a string into multiple parts. C++ has no built-in split() function, but there are several ways to accomplish the same task, such as using the getline() function, the strtok() function, or the find() and erase() functions. This article explains how to use these functions to split strings in C++.

Various Methods to Split Strings in C++

  1. Using Temporary String
  2. Using stringstream API of C++
  3. Using strtok() Function
  4. Using Custom split() Function
  5. Using std::getline() Function
  6. Using find(), substr(), and erase() Functions
  7. Using std::regex_token_iterator
  8. Using find_first_not_of() with find() Function

1. Using Temporary String

Let's look at a program that uses a temporary string to split a given char array into words.

Input

Output

Explanation

In this program, we append each character of arr to the temporary string s until we find the separator (a blank space in this case). Once we find the separator, we print the string s and then call clear() to empty it. We repeat the process until we reach the terminating \0.
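A minimal sketch of the approach described above could look like the following. The function name splitWithTemp is an assumption made for illustration, and the sketch collects the words in a vector instead of printing them so that the result can be inspected.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): splits a
// null-terminated char array on a single separator character.
std::vector<std::string> splitWithTemp(const char* arr, char separator = ' ') {
    std::vector<std::string> words;
    std::string s;  // temporary string holding the word being built
    for (std::size_t i = 0; arr[i] != '\0'; ++i) {
        if (arr[i] == separator) {
            // Separator found: store the finished word, then empty s.
            // Empty pieces (from repeated separators) are skipped.
            if (!s.empty()) words.push_back(s);
            s.clear();
        } else {
            s += arr[i];
        }
    }
    // The last word is not followed by a separator, so store it here.
    if (!s.empty()) words.push_back(s);
    return words;
}
```

Calling splitWithTemp("split this  string") returns the three words split, this, and string; the doubled space does not produce an empty word because of the empty() check.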

2. Using stringstream API of C++

A stringstream object can be initialized with a string object, and its extraction operator tokenizes the contents on whitespace automatically. Like the cin stream, a stringstream allows you to read a string as a stream of words. The most commonly used stringstream operators are as follows:

  • Operator<<:- pushes a string object into the stream.
  • Operator>>:- extracts a word from the stream.

Note: Tokenizing a string means splitting it with respect to a delimiter.

Syntax

Program

Output

Explanation

In this program, we create a stringstream object ss and run a loop for as long as a word can be extracted from the stream. In each iteration, we print the word extracted from the stream with the >> operator.
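The loop described above can be sketched as follows. The function name splitWords is an assumption, and the words are returned in a vector rather than printed.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits text on runs of whitespace using operator>>.
std::vector<std::string> splitWords(const std::string& text) {
    std::stringstream ss(text);  // initialize the stream with the string
    std::vector<std::string> words;
    std::string word;
    // operator>> skips leading whitespace and reads up to the next
    // whitespace character; the loop ends when extraction fails.
    while (ss >> word) {
        words.push_back(word);
    }
    return words;
}
```

Because operator>> treats spaces, tabs, and newlines alike, splitWords("  hello   world\tagain ") returns hello, world, and again. This approach only works for whitespace delimiters.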

3. Using strtok() Function

The strtok() function divides the original string into chunks or tokens based on the delimiters passed. It modifies the original string by overwriting the delimiter that ends each token with a null character (\0), and it keeps internal state between calls so that it can continue from where the previous token ended.

Syntax

The strtok() function takes two parameters and returns a pointer:

  1. string:- The string to split. It is passed on the first call; on later calls, NULL is passed to continue splitting the same string.
  2. delimiter:- A string containing the characters on which we separate the string.
  3. Return:- A pointer to the next token, or NULL when no tokens are left. After the first call, it points to the first token of the string.

Program

Input

Output

Explanation

In this program, we create a char pointer p, which points to the token returned by the strtok() function. We then run a loop for as long as the pointer is not NULL. In each iteration, we print the token that p points to and call strtok() again to get the next one.
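A sketch of this loop is shown below. The wrapper function splitWithStrtok is an assumption; it copies the input into a mutable buffer first, because strtok() overwrites delimiters in the string it is given.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical wrapper around std::strtok. Each character in
// `delimiters` is treated as a separator.
std::vector<std::string> splitWithStrtok(const std::string& input,
                                         const char* delimiters) {
    // strtok writes '\0' into its argument, so work on a mutable copy.
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    char* p = std::strtok(buffer.data(), delimiters);  // first call: pass the string
    while (p != nullptr) {
        tokens.emplace_back(p);
        p = std::strtok(nullptr, delimiters);  // later calls: pass a null pointer
    }
    return tokens;
}
```

For example, splitWithStrtok("a,b;;c", ",;") returns a, b, and c. Consecutive delimiters are treated as one, so strtok() never produces empty tokens. Because the function keeps its position in static internal state, it is not safe to interleave the tokenizing of two strings.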

4. Using Custom split() Function

In this program, we write a custom split() function that accepts a string and a delimiter and splits the given string.

Program

Output

Explanation

In this program, we create a function that accepts two parameters: a string and the delimiter on which to split it. We loop over the string and check each character against the delimiter (a blank space in this example). When we find a delimiter, we copy the part of the string between startIndex and endIndex into a temporary string temp. After that, we update startIndex and endIndex accordingly and push temp into the vector of strings.
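One possible shape for such a function is sketched below. The names startIndex, endIndex, and temp follow the explanation above; the exact loop structure is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch of a custom split: returns every piece between delimiters,
// including empty pieces produced by adjacent delimiters.
std::vector<std::string> split(const std::string& str, char delimiter) {
    std::vector<std::string> parts;
    std::size_t startIndex = 0;
    for (std::size_t endIndex = 0; endIndex <= str.size(); ++endIndex) {
        // Treat the end of the string like a final delimiter.
        if (endIndex == str.size() || str[endIndex] == delimiter) {
            std::string temp = str.substr(startIndex, endIndex - startIndex);
            parts.push_back(temp);
            startIndex = endIndex + 1;  // next piece starts after the delimiter
        }
    }
    return parts;
}
```

With this sketch, split("one two three", ' ') returns one, two, and three, while split("a,,b", ',') returns a, an empty string, and b. Whether empty pieces should be kept is a design choice; they can be filtered out with an empty() check before push_back.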

5. Using std::getline() Function

The getline() function is a C++ standard library function that reads characters from an input stream and stores them in a string until the delimiter character is found. By including the <string> header file, we can use the std::getline() function.

Syntax

The getline() function takes three parameters:

  1. stream:- It is the input stream (for example, a stringstream built from the original string) from which characters are read.
  2. token:- It is the string that receives each token extracted from the stream.
  3. delimiter:- It is the character based on which we separate the string.

Program

Output

Explanation

In this program, we create two strings, str and s, and a stringstream object, ss, which is built from the input string s. The stringstream object ss, the string str, and a delimiter (a blank space) are passed as arguments to the getline() function. In each iteration of the loop, getline() stores the next token in str, and we print this string.
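A sketch of this loop, using the variable names s, ss, and str from the explanation, is given below. The function name splitWithGetline is an assumption, and the tokens are collected in a vector instead of being printed.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits s on a single delimiter character
// by reading from a stringstream with std::getline.
std::vector<std::string> splitWithGetline(const std::string& s, char delimiter) {
    std::stringstream ss(s);  // stream over the input string
    std::vector<std::string> tokens;
    std::string str;          // receives one token per call
    // getline reads up to (and discards) the delimiter; the loop
    // stops once the stream has nothing left to extract.
    while (std::getline(ss, str, delimiter)) {
        tokens.push_back(str);
    }
    return tokens;
}
```

For example, splitWithGetline("2024-01-15", '-') returns 2024, 01, and 15. Unlike operator>>, getline() keeps empty tokens between adjacent delimiters, so "a,,b" split on ',' yields three tokens with an empty one in the middle.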

6. Using find(), substr(), and erase() Functions

The substr() function is a member function of the C++ string class. It accepts two arguments, pos and len, and returns a newly constructed string object whose value is a copy of a substring of the original object. Copying begins at pos and covers len characters, which means the range [pos, pos+len).

Syntax

The substr() function takes two parameters:

  1. position:- It is the position of the first character to be copied.
  2. length:- Length of the substring we want to generate.

Note: The substr() returns a string object and size_t is an unsigned integer type.

Program

Output

Explanation

In this program, we use the find() function to find the first occurrence of the delimiter in the input string s. After finding the position of the delimiter, we print the substring of s that ends just before that position. Then, using the erase() function, we remove that substring from the string and set end to the position of the next delimiter for the next iteration. We run this loop until no delimiter is left in the string.
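The steps above can be sketched as follows. The function name splitWithErase is an assumption, as are two details the explanation leaves open: the sketch erases the delimiter together with the token, and it stores the text that remains after the last delimiter as the final token.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: s is taken by value because erase() modifies it.
std::vector<std::string> splitWithErase(std::string s, const std::string& delimiter) {
    std::vector<std::string> tokens;
    if (delimiter.empty()) {  // an empty delimiter would loop forever
        tokens.push_back(s);
        return tokens;
    }
    std::size_t end = s.find(delimiter);
    while (end != std::string::npos) {
        tokens.push_back(s.substr(0, end));    // text before the delimiter
        s.erase(0, end + delimiter.length());  // drop the token and the delimiter
        end = s.find(delimiter);               // position of the next delimiter
    }
    tokens.push_back(s);  // whatever remains after the last delimiter
    return tokens;
}
```

Because find() accepts a string, this approach also handles multi-character delimiters: splitWithErase("a->b->c", "->") returns a, b, and c. Note that each erase() shifts the rest of the string, so the total work can grow faster than linearly for inputs with many tokens.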

7. Using std::regex_token_iterator

The read-only LegacyForwardIterator std::regex_token_iterator accesses the individual sub-matches of every regular expression match within the underlying character sequence. It can also be used to access the segments of the sequence that were not matched by the given regular expression (e.g., as a tokenizer). To split a string, we use std::sregex_token_iterator, which is a type alias for std::regex_token_iterator<std::string::const_iterator>.

Note: A regular expression, also known as a regex, is an expression that contains a series of characters that define a specific search pattern that can be used in string searching algorithms, find or find/replace algorithms, and so on.

Program

Output

Explanation

In this program, we create a vector of strings v from a pair of sregex_token_iterator objects. This vector stores the required split strings. Finally, we iterate over the vector v using a for loop and print one string in each iteration.
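A sketch of this technique is shown below. The function name splitWithRegex and the choice to pass the delimiter pattern as a string are assumptions made for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: `pattern` is a regular expression describing
// the delimiter, not the tokens.
std::vector<std::string> splitWithRegex(const std::string& s,
                                        const std::string& pattern) {
    const std::regex re(pattern);
    // The submatch index -1 selects the text between matches,
    // which turns the iterator into a tokenizer.
    std::sregex_token_iterator first(s.begin(), s.end(), re, -1);
    std::sregex_token_iterator last;  // default-constructed end iterator
    std::vector<std::string> v(first, last);
    return v;
}
```

For example, splitWithRegex("one, two,three", ",\\s*") returns one, two, and three, because the pattern matches a comma followed by optional whitespace. The regex object must be a named variable that outlives the iterator; the constructor overload taking a temporary regex is deleted.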

8. Using find_first_not_of() with find() Function

The find_first_not_of() function looks for the first character in the string that does not match any of the characters specified in its arguments. If such a character is found, its index is returned; otherwise, string::npos is returned.

Syntax

The find_first_not_of() function takes two parameters.

  1. str:- Another string containing the characters to be used in the search.
  2. idx:- It is the index number from which we must look for the first unmatched character.

Note: The const after the parameter list in the declaration of find_first_not_of() indicates that it is a constant member function, which means that it does not modify any data members of the string.

Program

Output

Explanation

In this program, we use the find_first_not_of() function to find the first position in the string str whose character is not the delimiter. This position is treated as the start of the substring and stored in the start variable. Next, we use the find() function to find the first position of the delimiter after start. This position acts as the end index of the substring and is stored in the end variable. The difference between end and start gives us the length of the required substring. Using the substr() function, we store the substring in the vector v. We repeat this process until start equals string::npos.
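The loop described above can be sketched like this, using the names str, start, end, and v from the explanation. The function name splitSkippingEmpty is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits str on a delimiter character and
// skips runs of delimiters, so no empty tokens are produced.
std::vector<std::string> splitSkippingEmpty(const std::string& str, char delimiter) {
    std::vector<std::string> v;
    // First character that is not the delimiter: start of a token.
    std::size_t start = str.find_first_not_of(delimiter);
    while (start != std::string::npos) {
        // Next delimiter after the token start; npos if none is left.
        std::size_t end = str.find(delimiter, start);
        // When end is npos, substr() clamps the length to the rest of str.
        v.push_back(str.substr(start, end - start));
        // Searching from npos returns npos, which ends the loop.
        start = str.find_first_not_of(delimiter, end);
    }
    return v;
}
```

For example, splitSkippingEmpty("  hello   world ", ' ') returns hello and world; leading, trailing, and repeated spaces are all ignored. The input string is never modified or copied as a whole, only the individual tokens are.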

Conclusion

  • There is no built-in split() function in C++ for splitting strings, but there are numerous ways to accomplish the same task, such as using the getline() function, strtok() function, find() and erase() functions, and so on.
  • The strtok() function divides the original string into chunks or tokens based on the delimiter passed.
  • The getline() function is a C++ standard library function that reads characters from an input stream and stores them in a string until the delimiter character is found.
  • The substr() function returns a substring of a given string by accepting two parameters.
  • The read-only LegacyForwardIterator std::regex_token_iterator accesses the individual sub-matches of every regular expression match within the underlying character sequence.
  • The find_first_not_of() function looks for the first character in the string that does not match the characters specified in its arguments.
  • Using a temporary string is one of the simplest ways to split a string. If the input string has length n, then:
    • Time complexity: O(n)
    • Auxiliary space: O(n)
