C Program to Sort Elements in Lexicographical Order (Dictionary Order)
Overview
Lexicographic order is just dictionary order. To sort elements in lexicographical order we will use the selection sort algorithm. We traverse over the array and at each iteration try to find the smallest string in the succeeding array. The time complexity of this algorithm to sort elements in lexicographic order in C is and the space complexity is where n is the number of strings and m is the maximum length of the strings.
Introduction
When we open our contact list on the phone, all the contact names are sorted in a well-defined order, all the contact numbers are in alphabetical order or dictionary order. This makes finding any specific contact simple, one can search using a real-life binary search algorithm, imagine all the contact's names in your phone are not in alphabetical or any other specific order then it will lead to taking ages to find any specific contact number. Searching becomes easy if the things are sorted in a well-defined order.
As numbers can be sorted based on numeric values like ‘9’ will lie before ‘11’ if sorted in increasing order. Similarly, for the string, there is a rule that is dictionary order also known as lexicographical order. If a string appears after in the dictionary as compared to another string, then it is said to be lexicographical greater than the other string. To compare two strings we go character by character, at first mismatch less the ASCII value of the current character makes the string lexicographical smaller. Also, there may be conditions when we do not find any mismatch if one string is a prefix of another, then it will appear first in the dictionary. For example, “to” is lexicographically smaller as compared to “top”.
Note:
ASCII is a set of 128 characters, where every character is given a particular value, all the lowercase letters lie in the range 97 to 122, uppercase letters lie in the range 65 to 90, numeric digits in range 48 to 57 and other positions are reserved for special characters, like underscores, a hyphen, '+','-','&', etc.
C Program to Sort Elements in the Dictionary Order (Lexicographically)
Let’s take an example to sort given elements in lexicographical order in C. In this example, we will take 5 animal names and will try to sort them in lexicographical order, here we are assuming every name will be of a maximum length of 20, and only 5 names are present.
Output :
Before going for the discussion of the above code for sorting the elements in the lexicographic order in C, let’s discuss two functions that we have used in the above code.
- strcmp :
This function takes two string’s as arguments and compares them, if both the strings are equal, then it will return zero, if the first string is lexicographically greater than the second, then it will return a positive number which is a difference of the first miss-match, and if the first string is lexicographically smaller than second then it will return a negative number which is a difference of first miss-match. - strcpy :
This function takes two string objects as arguments and copies the second string in the first string including the last null/terminal character, while sorting in lexicographic order in C, we need to swap the elements in that procedure we need this function.
Program Explanation
In the program, we were trying to set the elements in lexicographic order in C, for that, we traversed over the given array, at each iteration, we are going to traverse the array after the current index. In the nested array, we will check using the strcmp() function if the current nested loop index string is smaller than our current index value then we will swap them using the temp character array and strcpy() function. At each iteration, we are going to find the smallest string in the remaining array and replace it with the current string. Let's look into our string's positions at each iteration:
Initially, we have given strings in order :
-
In the first pass, we will compare from index-1 to index-4 with index-0
- index-1 string (“cat”) is lexicographically smaller as compared to the current iteration index (index-0) string “pig”, so we will swap them
- As “cat” is now at index-0 and it is the smallest among all, so no swap will occur in this pass
-
In the second pass, we will compare from index-2 to index-4 with index-1
- index-2 string (“dog”) is lexicographically smaller as compared to the current iteration index (index-1) string “pig”, so we will swap them
- index-3 string (“cow”) is lexicographically smaller as compared to the current iteration index (index-1) string “dog”, so we will swap them
- As “cow” is now at index-1 and it is the smallest among all left after it, so no swap will occur in this pass
- index-2 string (“dog”) is lexicographically smaller as compared to the current iteration index (index-1) string “pig”, so we will swap them
-
In the third pass, we will compare from index-3 to index-4 with index-2
- index-3 string("dog") is lexicographically smaller as compared to the current iteration index (index-2) string "pig", so will swap them
- As “dog” is now at index-2 and it is the smallest among all left after it, so no swap will occur in this pass
-
In the fourth pass, we will compare from index-4 to index-4 with index-3
- As "pig" is smaller as compared to "tiger" in lexicographical order, there will be no swap occurring.
-
There is no string after index-4, so no comparison occurs, and the loop will end.
This is our final required order.
Time and Space Complexity
Let the number of strings be n and the length of strings be m.
Time complexity As we are using a nested for loop which means the total number of iterations is n * n and at each iteration, we are comparing strings that will take m iterations. So time complexity of this function is . The out algorithm will always run for n*n times to make a comparison for every possible case, but the cost of comparison may reduce to in some special cases, that is when the first character of all the given strings are different. This will reduce the time complexity to , which is our best case.
Space Complexity As we are only using a temp character array as external memory so space complexity will be . is the best, average, and worst time complexity for space.
Conclusion
- Sorting of the string is a process in which we arrange elements by looking at their dictionary order, either in increasing or decreasing order.
- The standard order of string sorting is called the lexicographic order. Lexicographical order is just dictionary order.
- strcpy() function is used to copy the right argument string and assign it to the left argument string.
- strcmp() function used for comparing the string lexicographically and achieving the lexicographical order in C.
- strcmp() function returns zero if strings are matching, returns less than zero if the left string is smaller than the right string, and returns greater than zero if the right string is smaller than the left string.
- The time complexity of the above program to sort elements in lexicographical order in C is and space complexity is where n is the number of strings and m is the maximum length of the strings.