You compare strings to answer one of two questions: “Are these two strings equal?” or “In what order should these strings be placed when sorting them?”
Those two questions are complicated by factors that affect string comparisons:
- You can choose an ordinal or linguistic comparison.
- You can choose whether case matters.
- You can choose culture-specific comparisons.
- Linguistic comparisons depend on culture and platform.
When you compare strings, you define an order among them. Comparisons are used to sort a sequence of strings. Once the sequence is in a known order, it’s easier to search, both for software and for humans. Other comparisons check whether strings are the same. These sameness checks are similar to equality, but some differences, such as case differences, can be ignored.
Default ordinal comparisons
By default, the most common operations:
- String.Equals
- String.Equality and String.Inequality, that is, the equality operators == and !=, respectively
perform a case-sensitive, ordinal comparison. For String.Equals, you can provide a StringComparison argument to change its comparison rules. The following example shows that:

    string root = @"C:\users";
    string root2 = @"C:\Users";

    bool result = root.Equals(root2);
    Console.WriteLine($"Ordinal comparison: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");

    result = root.Equals(root2, StringComparison.Ordinal);
    Console.WriteLine($"Ordinal comparison: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");

    Console.WriteLine($"Using == says that <{root}> and <{root2}> are {(root == root2 ? "equal" : "not equal")}");

The default ordinal comparison does not take linguistic rules into account when comparing strings. It compares the binary value of each Char object in the two strings. As a result, the default ordinal comparison is also case-sensitive.
The equality test using String.Equals and the == and != operators differs from comparing strings using the String.CompareTo and Compare(String, String) methods. They all perform a case-sensitive comparison. However, while the equality tests perform an ordinal comparison, the CompareTo and Compare methods perform a culture-aware linguistic comparison using the current culture. Because these default comparison methods differ in how they compare strings, we recommend that you always make the intent of your code clear by calling an overload that explicitly specifies the type of comparison to perform.
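As a minimal sketch of that recommendation, the following snippet passes an explicit StringComparison value on every call. The sample strings are illustrative and are not taken from the examples above. The ordinal result follows from the character codes ('a' is 97, 'B' is 66). The culture-aware result can vary with the platform and globalization settings, so no output is shown for it.

```csharp
using System;

string lower = "apple";
string upper = "Banana";

// Ordinal: compares character codes, so 'a' (97) sorts after 'B' (66).
int ordinal = String.Compare(lower, upper, StringComparison.Ordinal);

// Linguistic: uses the current culture's rules; the sign may differ from the ordinal result.
int linguistic = String.Compare(lower, upper, StringComparison.CurrentCulture);

// Equality with the comparison type spelled out.
bool sameIgnoringCase = String.Equals(lower, "APPLE", StringComparison.OrdinalIgnoreCase);

Console.WriteLine($"Ordinal result is positive: {ordinal > 0}");   // True
Console.WriteLine($"Current-culture result: {linguistic}");
Console.WriteLine($"Equal ignoring case: {sameIgnoringCase}");     // True
```

Spelling out the comparison type on each call means a reader never has to remember which default a given method uses.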
Case-insensitive ordinal comparisons
The String.Equals(String, StringComparison) method lets you specify a StringComparison value of StringComparison.OrdinalIgnoreCase for a case-insensitive ordinal comparison. There is also a static String.Compare(String, String, StringComparison) method that performs a case-insensitive ordinal comparison if you specify StringComparison.OrdinalIgnoreCase for the StringComparison argument. Both are shown in the following code:

    string root = @"C:\users";
    string root2 = @"C:\Users";

    bool result = root.Equals(root2, StringComparison.OrdinalIgnoreCase);
    bool areEqual = String.Equals(root, root2, StringComparison.OrdinalIgnoreCase);
    int comparison = String.Compare(root, root2, comparisonType: StringComparison.OrdinalIgnoreCase);

    Console.WriteLine($"Ordinal ignore case: <{root}> and <{root2}> are {(result ? "equal." : "not equal.")}");
    Console.WriteLine($"Ordinal static ignore case: <{root}> and <{root2}> are {(areEqual ? "equal." : "not equal.")}");
    if (comparison < 0)
        Console.WriteLine($"<{root}> is less than <{root2}>");
    else if (comparison > 0)
        Console.WriteLine($"<{root}> is greater than <{root2}>");
    else
        Console.WriteLine($"<{root}> and <{root2}> are equivalent in order");

When performing a case-insensitive ordinal comparison, these methods use the casing conventions of the invariant culture.
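The following short sketch illustrates what a case-insensitive ordinal comparison does and does not ignore. The file names are made up for illustration. The point is that only casing is normalized; no linguistic equivalence (such as ‘ß’ and “ss”) is applied.

```csharp
using System;

// Case differences are ignored...
bool sameName = String.Equals("README.md", "readme.MD", StringComparison.OrdinalIgnoreCase);

// ...but no linguistic expansion happens: 'ß' is never treated as "ss".
bool sameStreet = String.Equals("Straße", "Strasse", StringComparison.OrdinalIgnoreCase);

Console.WriteLine(sameName);   // True
Console.WriteLine(sameStreet); // False
```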
Linguistic comparisons
Strings can also be ordered using linguistic rules for the current culture. This is sometimes called “word sort order.” When you perform a linguistic comparison, some nonalphanumeric Unicode characters might have special weights assigned. For example, the hyphen “-” might have a small weight assigned to it so that “co-op” and “coop” appear next to each other in sort order. In addition, some Unicode characters might be equivalent to a sequence of Char instances. The following example uses the phrase “They dance in the street.” in German with “ss” (U+0073 U+0073) in one string and ‘ß’ (U+00DF) in the other. Linguistically (on Windows), “ss” is equal to the German Eszett character ‘ß’ in both the “en-US” and “de-DE” cultures.

    string first = "Sie tanzen auf der Straße.";
    string second = "Sie tanzen auf der Strasse.";

    Console.WriteLine($"The first sentence is <{first}>");
    Console.WriteLine($"The second sentence is <{second}>");

    bool equal = String.Equals(first, second, StringComparison.InvariantCulture);
    Console.WriteLine($"The two strings {(equal ? "are" : "are not")} equal.");
    showComparison(first, second);

    string word = "coop";
    string words = "co-op";
    string other = "cop";

    showComparison(word, words);
    showComparison(word, other);
    showComparison(words, other);

    // Compares two strings linguistically (invariant culture) and ordinally.
    void showComparison(string one, string two)
    {
        int compareLinguistic = String.Compare(one, two, StringComparison.InvariantCulture);
        int compareOrdinal = String.Compare(one, two, StringComparison.Ordinal);
        if (compareLinguistic < 0)
            Console.WriteLine($"<{one}> is less than <{two}> using invariant culture");
        else if (compareLinguistic > 0)
            Console.WriteLine($"<{one}> is greater than <{two}> using invariant culture");
        else
            Console.WriteLine($"<{one}> and <{two}> are equivalent in order using invariant culture");
        if (compareOrdinal < 0)
            Console.WriteLine($"<{one}> is less than <{two}> using ordinal comparison");
        else if (compareOrdinal > 0)
            Console.WriteLine($"<{one}> is greater than <{two}> using ordinal comparison");
        else
            Console.WriteLine($"<{one}> and <{two}> are equivalent in order using ordinal comparison");
    }

On Windows, prior to .NET 5, the sort order of “cop”, “coop”, and “co-op” changes when you switch from a linguistic comparison to an ordinal comparison. The two German sentences also compare differently under the different comparison types. This is because, prior to .NET 5, the .NET globalization APIs used the National Language Support (NLS) libraries. In .NET 5 and later versions, the .NET globalization APIs use the International Components for Unicode (ICU) libraries, which unify .NET’s globalization behavior across all supported operating systems.
Comparisons using specific cultures
This example stores CultureInfo objects for the en-US and de-DE cultures. Comparisons are performed using a CultureInfo object to ensure a culture-specific comparison.
The culture used affects linguistic comparisons. The following example shows the results of comparing the two German sentences using the “en-US” culture and the “de-DE” culture:

    string first = "Sie tanzen auf der Straße.";
    string second = "Sie tanzen auf der Strasse.";

    Console.WriteLine($"The first sentence is <{first}>");
    Console.WriteLine($"The second sentence is <{second}>");

    var en = new System.Globalization.CultureInfo("en-US");

    // For culture-sensitive comparisons, use the String.Compare
    // overload that takes a CultureInfo and a CompareOptions value.
    int i = String.Compare(first, second, en, System.Globalization.CompareOptions.None);
    Console.WriteLine($"Comparing in {en.Name} returns {i}.");

    var de = new System.Globalization.CultureInfo("de-DE");
    i = String.Compare(first, second, de, System.Globalization.CompareOptions.None);
    Console.WriteLine($"Comparing in {de.Name} returns {i}.");

    bool b = String.Equals(first, second, StringComparison.CurrentCulture);
    Console.WriteLine($"The two strings {(b ? "are" : "are not")} equal.");

    string word = "coop";
    string words = "co-op";
    string other = "cop";

    showComparison(word, words, en);
    showComparison(word, other, en);
    showComparison(words, other, en);

    // Compares two strings linguistically (given culture) and ordinally.
    void showComparison(string one, string two, System.Globalization.CultureInfo culture)
    {
        int compareLinguistic = String.Compare(one, two, culture, System.Globalization.CompareOptions.None);
        int compareOrdinal = String.Compare(one, two, StringComparison.Ordinal);
        if (compareLinguistic < 0)
            Console.WriteLine($"<{one}> is less than <{two}> using {culture.Name} culture");
        else if (compareLinguistic > 0)
            Console.WriteLine($"<{one}> is greater than <{two}> using {culture.Name} culture");
        else
            Console.WriteLine($"<{one}> and <{two}> are equivalent in order using {culture.Name} culture");
        if (compareOrdinal < 0)
            Console.WriteLine($"<{one}> is less than <{two}> using ordinal comparison");
        else if (compareOrdinal > 0)
            Console.WriteLine($"<{one}> is greater than <{two}> using ordinal comparison");
        else
            Console.WriteLine($"<{one}> and <{two}> are equivalent in order using ordinal comparison");
    }

Culture-sensitive comparisons are typically used to compare and sort strings entered by users with other strings entered by users. The characters and sorting conventions for these strings may vary depending on the locale of the user’s computer. Even strings that contain identical characters can be sorted differently based on the culture of the current thread.
Linguistic sorting and searching strings in arrays
The following examples show how to sort and find strings in an array using a culture-dependent linguistic comparison. Use static Array methods that take a System.StringComparer parameter.
This example shows how to sort an array of strings using the current culture:

    string[] lines = new string[]
    {
        @"c:\public\textfile.txt",
        @"c:\public\textFile.TXT",
        @"c:\public\Text.txt",
        @"c:\public\testfile2.txt"
    };

    Console.WriteLine("Non-sorted order:");
    foreach (string s in lines)
    {
        Console.WriteLine($"   {s}");
    }

    Console.WriteLine("\n\rSorted order:");

    // Specify Ordinal to demonstrate the different behavior.
    Array.Sort(lines, StringComparer.CurrentCulture);

    foreach (string s in lines)
    {
        Console.WriteLine($"   {s}");
    }

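For contrast, here is a small sketch of the same sort performed with StringComparer.Ordinal instead of the current culture. It reuses the four paths from the example above. Because an ordinal sort compares character codes, the uppercase letters ‘T’ and ‘F’ sort before their lowercase forms, so the resulting order is fixed regardless of culture.

```csharp
using System;

string[] lines =
{
    @"c:\public\textfile.txt",
    @"c:\public\textFile.TXT",
    @"c:\public\Text.txt",
    @"c:\public\testfile2.txt"
};

// Ordinal sorting compares character codes rather than linguistic weights.
Array.Sort(lines, StringComparer.Ordinal);

foreach (string s in lines)
{
    Console.WriteLine($"   {s}");
}
// Resulting order: Text.txt, testfile2.txt, textFile.TXT, textfile.txt
```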
After you sort the array, you can search for entries using a binary search. A binary search begins in the center of the collection to determine which half of the collection would contain the searched string. Each subsequent comparison subdivides the remaining part of the collection in half. The array is sorted using StringComparer.CurrentCulture. The local ShowWhere function displays information about where the string was found. If the string was not found, the return value indicates where it would be if it were found.

    string[] lines = new string[]
    {
        @"c:\public\textfile.txt",
        @"c:\public\textFile.TXT",
        @"c:\public\Text.txt",
        @"c:\public\testfile2.txt"
    };
    Array.Sort(lines, StringComparer.CurrentCulture);

    string searchString = @"c:\public\TEXTFILE.TXT";
    Console.WriteLine($"Binary search for <{searchString}>");
    int result = Array.BinarySearch(lines, searchString, StringComparer.CurrentCulture);
    ShowWhere<string>(lines, result);

    // A non-negative result is the index of the match; index 0 is a valid hit.
    Console.WriteLine($"{(result >= 0 ? "Found" : "Did not find")} {searchString}");

    void ShowWhere<T>(T[] array, int index)
    {
        if (index < 0)
        {
            // The bitwise complement of a negative result is the insertion point.
            index = ~index;

            Console.Write("Not found. Sorts between: ");

            if (index == 0)
                Console.Write("beginning of sequence and ");
            else
                Console.Write($"{array[index - 1]} and ");

            if (index == array.Length)
                Console.WriteLine("end of sequence.");
            else
                Console.WriteLine($"{array[index]}.");
        }
        else
        {
            Console.WriteLine($"Found at index {index}.");
        }
    }

Ordinal sorting and searching in collections
The following code uses the System.Collections.Generic.List<T> collection class to store strings. The strings are sorted using the List<T>.Sort method. This method needs a delegate that compares and orders two strings. The static String.CompareOrdinal method provides that comparison function, so this sort operation is a case-sensitive ordinal sort. (The String.CompareTo method would instead perform a culture-sensitive comparison using the current culture.) Run the sample and observe the order. You would use the static String.Compare methods to specify different comparison rules.

    List<string> lines = new List<string>
    {
        @"c:\public\textfile.txt",
        @"c:\public\textFile.TXT",
        @"c:\public\Text.txt",
        @"c:\public\testfile2.txt"
    };

    Console.WriteLine("Non-sorted order:");
    foreach (string s in lines)
    {
        Console.WriteLine($"   {s}");
    }

    Console.WriteLine("\n\rSorted order:");

    // Case-sensitive ordinal sort.
    lines.Sort((left, right) => String.CompareOrdinal(left, right));
    foreach (string s in lines)
    {
        Console.WriteLine($"   {s}");
    }

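To illustrate the last point about specifying different comparison rules, the sketch below sorts a list with a delegate that calls String.Compare and passes StringComparison.OrdinalIgnoreCase. The list contents are hypothetical sample data chosen so that the case-insensitive order is unambiguous.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data.
var names = new List<string> { "banana", "Apple", "cherry" };

// The delegate states the comparison rules explicitly: ordinal, ignoring case.
names.Sort((left, right) => String.Compare(left, right, StringComparison.OrdinalIgnoreCase));

Console.WriteLine(String.Join(", ", names)); // Apple, banana, cherry
```

A case-sensitive ordinal sort of the same data would instead place “Apple” first and then “banana” and “cherry”, which happens to match here, but it would also place any capitalized entry ahead of every lowercase one.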
Once sorted, the list of strings can be searched using a binary search. The following example shows how to search the sorted list using the same ordinal comparison rules. The ShowWhere local function shows where the searched text is or would be:

    List<string> lines = new List<string>
    {
        @"c:\public\textfile.txt",
        @"c:\public\textFile.TXT",
        @"c:\public\Text.txt",
        @"c:\public\testfile2.txt"
    };
    lines.Sort((left, right) => String.CompareOrdinal(left, right));

    string searchString = @"c:\public\TEXTFILE.TXT";
    Console.WriteLine($"Binary search for <{searchString}>");

    // Search with the same (ordinal) rules that were used to sort.
    int result = lines.BinarySearch(searchString, StringComparer.Ordinal);
    ShowWhere<string>(lines, result);

    // A non-negative result is the index of the match; index 0 is a valid hit.
    Console.WriteLine($"{(result >= 0 ? "Found" : "Did not find")} {searchString}");

    void ShowWhere<T>(IList<T> collection, int index)
    {
        if (index < 0)
        {
            // The bitwise complement of a negative result is the insertion point.
            index = ~index;

            Console.Write("Not found. Sorts between: ");

            if (index == 0)
                Console.Write("beginning of sequence and ");
            else
                Console.Write($"{collection[index - 1]} and ");

            if (index == collection.Count)
                Console.WriteLine("end of sequence.");
            else
                Console.WriteLine($"{collection[index]}.");
        }
        else
        {
            Console.WriteLine($"Found at index {index}.");
        }
    }

Always be sure to use the same type of comparison for sorting and searching. Using different types of comparison to sort and search produces unexpected results.
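The following sketch shows one way such a mismatch can go wrong. The array contents are hypothetical. The array is sorted with a case-sensitive ordinal comparer and then searched twice: once with the same comparer and once with a case-insensitive one, which assumes an order the array does not have.

```csharp
using System;

string[] fruits = { "banana", "Apple", "Cherry" };

// Sort with a case-sensitive ordinal comparer: Apple, Cherry, banana.
Array.Sort(fruits, StringComparer.Ordinal);

// Searching with the same comparer finds the entry.
int matched = Array.BinarySearch(fruits, "banana", StringComparer.Ordinal);

// Searching with a different comparer looks in the wrong half of the array.
int mismatched = Array.BinarySearch(fruits, "banana", StringComparer.OrdinalIgnoreCase);

Console.WriteLine($"Same comparer: index {matched}");                   // index 2
Console.WriteLine($"Different comparer: found = {mismatched >= 0}");    // found = False
```

The string is present in both cases; only the second search fails, because the binary search discards the half of the array that actually contains it.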
Collection classes such as System.Collections.Hashtable and System.Collections.Generic.Dictionary<TKey,TValue> have constructors that accept a System.StringComparer when the type of the keys is string. System.Collections.Generic.List<T> has no such constructor, but its Sort and BinarySearch methods accept a comparer. In general, you should use these constructors and overloads whenever possible and specify either StringComparer.Ordinal or StringComparer.OrdinalIgnoreCase.
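As a brief sketch of that advice, the snippet below builds a dictionary with StringComparer.OrdinalIgnoreCase so that key lookups ignore case. The settings table and its key names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical settings table whose keys are matched without regard to case.
var settings = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
{
    ["Timeout"] = 30
};

// The lookup succeeds even though the casing differs from the stored key.
bool found = settings.TryGetValue("TIMEOUT", out int timeout);

// Assigning through a differently cased key updates the same entry.
settings["timeout"] = 60;

Console.WriteLine($"{found}, {timeout}, {settings.Count}, {settings["Timeout"]}"); // True, 30, 1, 60
```

Passing the comparer to the constructor fixes the rules once for the whole collection, so individual lookups cannot accidentally use a different comparison.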
Reference equality and string interning
None of the examples used ReferenceEquals. This method determines whether two strings are the same object, which can lead to inconsistent results in string comparisons. The following example demonstrates the string interning feature of C#. When a program declares two or more identical string variables, the compiler stores them all in the same location. By calling the ReferenceEquals method, you can see that the two strings actually refer to the same object in memory. Use the String.Copy method to avoid interning. After the copy is made, the two strings have different storage locations, even though they have the same value. Run the following example to show that strings a and b are interned, meaning they share the same storage. The strings a and c are not.

    string a = "The computer ate my source code.";
    string b = "The computer ate my source code.";

    if (String.ReferenceEquals(a, b))
        Console.WriteLine("a and b are interned.");
    else
        Console.WriteLine("a and b are not interned.");

    string c = String.Copy(a);

    if (String.ReferenceEquals(a, c))
        Console.WriteLine("a and c are interned.");
    else
        Console.WriteLine("a and c are not interned.");

You can intern a string or retrieve a reference to an existing interned string by calling the String.Intern method. To determine whether a string is interned, call the String.IsInterned method.
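The following is a small sketch of those two methods, not a definitive pattern. It builds a string at run time with StringBuilder (an assumption made here so that the string is guaranteed to be a separate object) and then asks the intern pool for the shared reference.

```csharp
using System;
using System.Text;

string literal = "The computer ate my source code.";

// Build an equal string at run time so that it is a separate object.
string built = new StringBuilder("The computer ate ").Append("my source code.").ToString();

Console.WriteLine(literal == built);                        // True: same value
Console.WriteLine(Object.ReferenceEquals(literal, built));  // False: different objects

// String.Intern returns the pooled reference for a given value.
string pooledFromBuilt = String.Intern(built);
string pooledFromLiteral = String.Intern(literal);
Console.WriteLine(Object.ReferenceEquals(pooledFromBuilt, pooledFromLiteral)); // True

// String.IsInterned returns the pooled reference, or null if the value is not in the pool.
Console.WriteLine(String.IsInterned(built) != null); // True
```

Value equality (== or String.Equals) remains the right test for string contents; reference checks only tell you whether two variables share storage.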
See also
- System.Globalization.CultureInfo
- System.StringComparer
- Strings
- String Comparison
- Application Globalization and Localization