Understanding String Comparison in Python: Why 'hello' <= 'hi' is True
When working with strings in Python, you might encounter unexpected results, especially when comparing strings. One such example is the expression:
'hello' <= 'hi' # True
At first glance, it seems counterintuitive. How can a longer string ('hello') be less than or equal to a shorter one ('hi')? The answer lies in lexicographical ordering.
What is Lexicographical Ordering?
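In Python, strings are compared using lexicographical (dictionary) ordering, similar to how words are sorted in a dictionary. Strings are compared character by character, starting from the first character, based on each character's Unicode code point (for ASCII characters this is the same as the ASCII value). The first position where the two strings differ decides the result. If one string runs out before any difference is found, the shorter string is the smaller one.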
For instance, in the case of 'hello' <= 'hi', Python evaluates it as follows:
First Character Comparison:
The first characters of both strings are 'h'. Since these characters are equal, the comparison proceeds to the next character.
Second Character Comparison:
The second character of 'hello' is 'e', and for 'hi', it is 'i'. Python compares these two characters using their Unicode values:
The Unicode value of 'e' is 101.
The Unicode value of 'i' is 105.
Since 101 < 105, 'hello' is considered less than 'hi'.
Thus, even though 'hello' is longer than 'hi', Python determines that 'hello' is less because 'e' comes before 'i' in the Unicode table.
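The walk-through above can be reproduced in a few lines. The helper name lexicographic_le is hypothetical and exists only for this illustration. It is a sketch of the idea under the assumption of plain str inputs, not a description of how CPython implements comparison internally.

```python
def lexicographic_le(left: str, right: str) -> bool:
    """Return True if left <= right, comparing code points one pair at a time."""
    for a, b in zip(left, right):
        if a != b:
            # The first differing pair of characters decides the result.
            return ord(a) < ord(b)
    # No difference within the shared prefix: the shorter (or equal) string
    # is not greater than the other one.
    return len(left) <= len(right)


print(ord('e'), ord('i'))               # 101 105
print(lexicographic_le('hello', 'hi'))  # True
print('hello' <= 'hi')                  # True
```

Note that the length of the strings only matters when one string is a prefix of the other, which is why 'hello' can be "less than" the shorter 'hi'.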
String Comparison in Python
String comparisons in Python use the following comparison operators:
==: True if both strings are identical.
!=: True if strings are not identical.
<: True if the left string comes before the right string lexicographically.
>: True if the left string comes after the right string lexicographically.
<=: True if the left string is either less than or equal to the right string.
>=: True if the left string is either greater than or equal to the right string.
Examples:
# Case 1: Equal strings
'apple' == 'apple' # True
# Case 2: Lexicographical comparison
'abc' < 'abd' # True ('c' comes before 'd')
# Case 3: Comparing strings of different lengths
'cat' < 'catalog' # True ('cat' is a prefix of 'catalog')
# Case 4: Unicode comparison
'python' > 'Python' # True ('p' has a greater Unicode value than 'P')
Unicode Behind the Scenes
Each character in a string corresponds to a Unicode code point, which is an integer representing the character. For example:
'a' has the Unicode value 97.
'A' has the Unicode value 65.
'e' has the Unicode value 101.
'i' has the Unicode value 105.
Because Python compares strings by these code points, even subtle differences between characters, such as upper case versus lower case, affect the result.
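These values can be checked with the built-in ord() function, and chr() converts a code point back into a character. The snippet below is a small illustration; the variable name code_points is chosen only for this example.

```python
# Map each character to its Unicode code point.
code_points = {ch: ord(ch) for ch in 'aAei'}
print(code_points)  # {'a': 97, 'A': 65, 'e': 101, 'i': 105}

# chr() is the inverse of ord().
print(chr(97))      # a

# Every uppercase ASCII letter has a smaller code point than every
# lowercase one, so 'Z' (90) compares as less than 'a' (97).
print('Z' < 'a')          # True
print('Zebra' < 'apple')  # True
```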
Practical Implications
Understanding string comparison is crucial when working with functions like sorted(), min(), and max(). By default, these functions rely on lexicographical order to determine the ordering of strings and the smallest or largest string.
Example:
words = ['banana', 'apple', 'grape', 'pear']
sorted_words = sorted(words) # ['apple', 'banana', 'grape', 'pear']
In this example, Python sorts the list of words lexicographically, not by length or any other property.
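A short sketch of how the same ordering shows up in min() and max(), and what happens with mixed-case input. The list mixed is made up for illustration, and the key=len and key=str.lower calls are shown only as a contrast with the default behavior, assuming you want an ordering other than the raw code-point one.

```python
words = ['banana', 'apple', 'grape', 'pear']

# min() and max() use the same lexicographical comparison as sorted().
print(min(words))  # apple
print(max(words))  # pear

# Sorting by length has to be requested explicitly with a key function.
print(sorted(words, key=len))  # ['pear', 'apple', 'grape', 'banana']

# Uppercase letters have smaller code points, so 'Pear' sorts first by default.
mixed = ['banana', 'Pear', 'apple']
print(sorted(mixed))                 # ['Pear', 'apple', 'banana']
print(sorted(mixed, key=str.lower))  # ['apple', 'banana', 'Pear']
```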
Conclusion
The comparison 'hello' <= 'hi' evaluates to True because Python compares strings lexicographically. The second character 'e' in 'hello' has a smaller Unicode value than 'i' in 'hi', making the entire string 'hello' less than 'hi'.
Understanding how Python compares strings helps avoid confusion and ensures you're using comparisons correctly in your programs. When in doubt, remember that Python treats strings much like words in a dictionary—one character at a time.
Written by
Ahnaf Tahmid Zaman