How to Split a String in SQL Server?
Overview
Effectively handling strings that contain multiple values separated by delimiters is a frequent hurdle in database management. Whether it involves managing article tags or segmenting user inputs, the need for efficient data parsing is evident.
SQL Server's STRING_SPLIT function steps in as a robust solution, streamlining string manipulation. In this article, we'll explore its functionality, syntax, parameters, and real-world applications.
What is SQL Server STRING_SPLIT Function?
The SQL Server STRING_SPLIT function is a powerful tool that allows you to break down a string into smaller parts using a designated separator. This can be especially handy when you need to split a sentence into individual words. You can imagine STRING_SPLIT as a tool that transforms a single string into a table with rows, each representing a substring separated by the specified separator.
To make use of this function, you can simply provide the string you want to divide as the first argument and the separator as the second argument, typically using a space (' ') to split words in a sentence. When you use STRING_SPLIT, it produces a result in the form of a column named "value." If you want each part of the string to appear in a separate row, you can achieve this by selecting "value" from the STRING_SPLIT function.
Note: The order of the values in the "value" column may not necessarily follow a specific sequence; they can appear in any order.
Syntax of SQL Server STRING_SPLIT Function
The SQL Server STRING_SPLIT function provides two different syntaxes to accommodate varying user requirements. This allows for flexibility in how the function can be utilized.
Here's the syntax of the STRING_SPLIT() function:
Here's another syntax of the SQL Server STRING_SPLIT function with the optional ordinal parameter:
In the syntax, square brackets [] are used to denote optional parameters. When using the enable_ordinal parameter, the square brackets indicate that it is not required for basic functionality but can be included for specialized operations.
It's important to note that when using the enable_ordinal parameter, you have the option to include or omit the square brackets around it. Both STRING_SPLIT(input_string, separator, enable_ordinal) and STRING_SPLIT(input_string, separator [, enable_ordinal]) are valid forms.
Parameters of SQL Server STRING_SPLIT Function
- input_string: It represents a character-based expression, which should be of the data types NVARCHAR, VARCHAR, NCHAR, or CHAR.
- separator: It's a single character used to show where the splitting happens. Commonly used separators include characters like commas (,), spaces ( ), hyphens (-), or any other character relevant to your data. This parameter essentially defines the boundary for segmenting the input string into substrings.
- enable_ordinal: It is an optional parameter. If you choose to include it and set it to 1, the function will return an additional column named "ordinal" that contains 1-based index values for each substring's position within the input string. Bigint would be the data type of this column allowing the handling of large numerical values.
Return Types
When you use STRING_SPLIT without enabling the ordinal output column, it provides you with a table having a single column called "value." Each row in this column represents a substring obtained after the input string is split using the designated separator.
However, if you set the enable_ordinal parameter to 1, the function adds a second column named "ordinal" to the result table. This new column holds 1-based index values that indicate the position of each substring within the input string. This additional information is immensely valuable for tasks that require precise identification and manipulation of individual substrings.
By incorporating the "ordinal" column, you gain the ability to perform custom sorting and ordering of substrings.
SQL Server STRING_SPLIT() Function Examples
Using the STRING_SPLIT() Function to Split Comma-separated Value String
In this example, the STRING_SPLIT() function is applied to dissect a list of values separated by commas. Here's a sample query that extracts the individual elements:
Output:
Value |
---|
apple |
banana |
(blank) |
cherry |
The empty entry in the output of the STRING_SPLIT function is due to the presence of consecutive separators (commas) in the input string without any content between them. As a result, it generates an empty string as one of the split substrings, because it considers the portion between the first comma and the second comma (, and ,) as an empty value.
Using STRING_SPLIT() Function to Split a Comma-separated String in a Column
In certain situations, database tables may lack normalization. An illustrative instance of this occurs when a column can hold multiple values separated by commas (,).
The STRING_SPLIT() function proves invaluable in normalizing such data by dividing these multi-valued columns into separate entities. Let's see an example to understand this:
Suppose we have a table named "products" with the following structure:
In this table, the "categories" column contains multiple categories for each product, separated by commas. Now, we want to normalize this data by splitting the categories into separate rows using STRING_SPLIT(). Here's the query to achieve that:
Output:
id | name | category |
---|---|---|
1 | Laptop | Electronics |
1 | Laptop | Computers |
2 | TV | Electronics |
2 | TV | Entertainment |
3 | Book | Books |
4 | Headphones | Electronics |
4 | Headphones | Audio |
In the output, the STRING_SPLIT() function has split the categories for each product into separate rows, effectively normalizing the data.
Using the STRING_SPLIT() Function with an Aggregate Function
Let's use our table named "products" that we created above with the following structure to understand the aggregate function:
In this table, the "tags" column contains multiple tags for each product, separated by commas. Now, we want to count how many products have each tag. Here's the query to achieve that using the STRING_SPLIT() function:
Output:
tag | product_count |
---|---|
Electronics | 4 |
Computers | 1 |
Entertainment | 1 |
Books | 1 |
Audio | 1 |
Photography | 1 |
In the output, we used STRING_SPLIT() to split the tags for each product and then counted how many products have each tag.
Using the STRING_SPLIT() Function to Find Rows by Ordinal Values
Here is another example demonstrating the concept of using the STRING_SPLIT() function to work with even index values. In this case, we're retrieving all rows with even index positions:
Output:
value | ordinal |
---|---|
New York | 2 |
Chicago | 6 |
In the output, we are using STRING_SPLIT() to split a comma-separated list of cities and states. We then filter the rows to select only those with even index values (2 and 6).
Using the STRING_SPLIT() Function to Order Rows by Ordinal Values
Here's an example that demonstrates the concept of using STRING_SPLIT() to obtain split substring values along with their ordinal values.
Output:
value | ordinal |
---|---|
8 | 8 |
7 | 7 |
6 | 6 |
5 | 5 |
4 | 4 |
3 | 3 |
2 | 2 |
1 | 1 |
Conclusion
- SQL Server's STRING_SPLIT function helps normalize data with multiple values in a single column, separated by delimiters.
- Its syntax requires only two parameters: the input string and the separator. Optionally, an ordinal parameter can be used for index-based operations.
- STRING_SPLIT can be used for various tasks, such as splitting sentences into words or handling complex data structures like tags and categories.
- Enabling the ordinal parameter allows you to work with index values of split substrings.
- STRING_SPLIT can be combined with other SQL operations, making it useful for tasks like counting, grouping, and analyzing data effectively.
- Whether you're extracting individual words from sentences or managing complex data structures, SQL Server's STRING_SPLIT simplifies the process.
- By providing the ability to enable the "ordinal" parameter, SQL Server allows for advanced operations that rely on the positional information of substrings.