Python Strings - From Basics to Brilliance


Introduction
As a beginner in Python, I’ve realized that text data is used everywhere in the real world, in tasks like handling user input, cleaning and processing data, reading and writing files, debugging and many more. Python strings let us create, handle and manipulate this text effectively and efficiently. So in this blog post, I will share what I have learned about Python strings so far.
What is a String in Python?
A string in Python is a sequence of Unicode characters enclosed in single, double or triple quotes. It can include letters, digits and special characters. Unlike C/C++, Python has no separate character data type, so a single character is simply a string of length 1.
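Here is a quick check of the "no character type" point. The variable name `ch` is just something I picked for illustration:

```python
ch = 'a'

# A single character is an ordinary str object of length 1
print(type(ch))  # <class 'str'>
print(len(ch))   # 1
```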
Creating Strings
In Python, strings can be created using single (' '), double (" ") or triple (''' ''' or """ """) quotes.
#Using Single Quotes
print('Hello everyone')
#Using Double Quotes
print("Welcome")
#Using Triple Quotes
print('''Have a nice day''')
print("""What are you doing?""")
You might be wondering why Python supports three different ways to create a string even though they all work the same.
The main reason is to give the user more flexibility, especially when dealing with quotes inside strings. Choosing a different outer quote makes it easy to include quotes in a string without having to use the escape character (\). Triple quotes are also used for strings that span multiple lines.
print("It's a nice day.")
print('She said,"Hello everyone".')
print('''Line One
Line Two
Line Three ''')
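For comparison, here is a small sketch of the same kind of strings written with the escape character instead of switching quote styles. The variable names are only for illustration:

```python
# Escaping the inner quote with a backslash gives the same string
# as choosing a different outer quote style.
escaped = 'It\'s a nice day.'
plain = "It's a nice day."
print(escaped == plain)  # True

# \n inside a normal string produces the same line break
# as a triple-quoted multi-line string.
one_line = 'Line One\nLine Two'
multi_line = '''Line One
Line Two'''
print(one_line == multi_line)  # True
```

Both versions work, so it mostly comes down to which one is easier to read.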
Accessing Characters and Sub-strings
To access just a character or a part of a string, Python has two ways:
Indexing
Slicing
- Indexing
Since strings are just a sequence of characters, we can use indexing to access individual characters. There are two types of indexing: Positive and Negative.
In Positive Indexing, characters are numbered from left to right, starting at 0, then 1 and so on.
In Negative Indexing, characters are numbered from right to left, starting at -1, then -2 and so on. This lets us access the last character without knowing the length of the string.
a='I am a girl'
print(a[0]) #Positive Indexing
print(a[-2]) #Negative Indexing
#Output:
#I
#r
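One more thing I found useful to know is what happens at the edges. The sketch below (with variable names I made up) shows that the last character can be reached from both directions, and that an index beyond the string raises an IndexError:

```python
a = 'I am a girl'

# The last character can be reached from either direction
last_positive = a[len(a) - 1]
last_negative = a[-1]
print(last_positive == last_negative)  # True

# An index outside the string raises an IndexError
try:
    a[len(a)]
except IndexError as err:
    error_message = str(err)
    print('IndexError:', error_message)
```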
- Slicing
To access a part of a string, we use String Slicing. The syntax for slicing is string[start:end:step], where start is the starting index, end is the stopping index (excluded) and step is the number of positions to move between characters (defaults to 1). Both positive and negative indexing can be used in slicing.
s='Summer'
#using positive indexing
print(s[0:4]) #from index 0 to 3
print(s[3:]) #from index 3 to the end
print(s[:4]) #from the beginning to index 3 (index 4 is excluded)
print(s[:]) #whole string
print(s[::2]) #every second character i.e. index 0,2,4
print(s[::-1]) #reversed string
print(s[5:0:-1]) #from index 5 to 1 (0 is excluded), backwards
print(s[::1]) #whole string with normal steps
print()
#using negative indexing
print(s[-3:]) #from index -3 to end
print(s[:-3]) #from the beginning to index -4
print(s[-6:-1:2]) #from index -6 to -2 with a step of 2
print(s[-1:-6:-1]) #from last character to index -5, reversed
print(s[-1::-1]) #reversed string
#Output
# Summ
# mer
# Summ
# Summer
# Sme
# remmuS
# remmu
# Summer
# mer
# Sum
# Sme
# remmu
# remmuS
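Unlike indexing, slicing is forgiving about bounds. A minimal sketch, where `cut` is just an illustrative name:

```python
s = 'Summer'

# Out-of-range slice bounds are clipped instead of raising an error
print(s[2:100])  # mmer
print(s[10:])    # prints an empty string

# A slice and its complement always rebuild the original string
cut = 3
rebuilt = s[:cut] + s[cut:]
print(rebuilt == s)  # True
```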
Can Strings be Edited or Deleted?
Python strings are immutable, which means you cannot change, update or delete a character of a string after you have created it.
s='Hii'
s[0]='h' #This raises a TypeError
Since strings cannot be modified in Python, we use string concatenation, string methods and other ways to create new strings according to our need. (We will cover this in the later part of the blog.)
As for deletion, we cannot delete an individual character or a sub-string of a string in Python. But we can delete a whole string variable at once. Once deleted, the name cannot be used again unless we assign a new value to it.
s='data science'
del s[0] #not possible (error)
s='Good Morning'
del s #deletes the whole string
print(s) #raises a NameError as s was deleted in the previous line
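So how do we "edit" a string in practice? One common approach, sketched below with names I chose for illustration, is to build a new string from slices of the old one:

```python
s = 'Hii'

# "Change" the first character: new character + the rest of the old string
s = 'h' + s[1:]
print(s)  # hii

# "Delete" the first character: keep everything except it
word = 'data science'
without_first = word[1:]
print(without_first)  # ata science
```

In both cases a brand new string is created; the old one is never modified.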
Common Operations on Strings
Since Python strings are immutable, they cannot be changed once created. However, there are various ways to work around this. Here are some of the most common operations that you will use frequently:
Concatenation
Repetition
Membership
Comparisons
Using loops with Strings
Logical Operation
Concatenation
This operation uses a ‘+’ symbol to join two or more strings to form a new string.
name='Ashuka' + ' ' + 'Acharya'
print(name)
#Output
#Ashuka Acharya
Repetition
This operation joins multiple copies of the string using the ‘*’ operator.
print('Divider')
print('-'*15)
#Output
#Divider
#---------------
Membership
This operation allows you to check if a character or a sub-string is present or not in a string using ‘in’ and ‘not in’ keywords.
print('J' in 'Java')
print('j' not in 'Java') #membership checks are case-sensitive
#Output
#True
#True
email='ashuka123@mail.com'
if '@' in email:
    print('Valid Email.')
#Output
#Valid Email.
Comparisons
This operation compares strings in lexicographical order, which is like dictionary order but based on Unicode code points rather than the plain alphabet (so all uppercase English letters come before the lowercase ones). Strings are compared character by character based on their Unicode values (the same as ASCII values for English letters, digits and common symbols) using the ‘<’, ‘>’, ‘==’ and ‘!=’ operators.
print('apple' < 'Apple') #False, since the ASCII value of 'a' (97) > 'A' (65)
print('car' == 'car') #each character has the same ASCII value
print('car' < 'cars') #the shorter string < longer string if all previous characters are equal
print('app' != 'apple') #since the strings differ ('apple' has extra characters)
#Output
#False
#True
#True
#True
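To see the values behind these comparisons, we can use the built-in ord() function. The sketch below also shows one simple way to compare strings while ignoring case, which is my own addition and not the only way to do it:

```python
# ord() returns the Unicode code point of a character
print(ord('a'), ord('A'))  # 97 65

first = 'apple'
second = 'Apple'

# A direct comparison is case-sensitive
print(first == second)  # False

# Comparing lowercase copies ignores the difference in case
same_ignoring_case = first.lower() == second.lower()
print(same_ignoring_case)  # True
```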
Using loops with Strings
You can iterate through a string using loops (both for and while).
for i in 'Hello':
    print(i,end='_')
print()
s='Ashuka'
while s:
    print(s[0],end=' ')
    s=s[1:]
#Output
#H_e_l_l_o_
#A s h u k a
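Loops become more useful when combined with a condition. As a small sketch (the names `text` and `vowel_count` are mine), here is a loop that counts the vowels in a string using the membership operator:

```python
text = 'Ashuka'
vowel_count = 0

for ch in text:
    # lower() lets the check treat 'A' and 'a' the same way
    if ch.lower() in 'aeiou':
        vowel_count += 1

print(vowel_count)  # 3
```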
Logical Operation
These operations let you build conditions that involve strings using the ‘and’, ‘or’ and ‘not’ operators.
print('-'*15)
print('For and')
print('-'*15)
print('hello' and 'world')
print('hello' and '')
print('-'*15)
print('For or')
print('-'*15)
print('hello' or 'world')
print ('' or 'me')
print('-'*15)
print('For not')
print('-'*15)
print(not ' ')
print(not '')
#Output
# ---------------
# For and
# ---------------
# world
# (an empty line, because 'hello' and '' returns the empty string)
# ---------------
# For or
# ---------------
# hello
# me
# ---------------
# For not
# ---------------
# False
# True
Python treats any non-empty string as True (even one that contains only a space) and an empty string as False.
‘and’ → returns the last operand if all operands are true, else the first false operand.
‘or’ → returns the first true operand if any operand is true, else the last false operand.
‘not’ → returns True if the operand is an empty string, else returns False.
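A place where I have seen this behaviour used is falling back to a default value when a string is empty. This is only a sketch; the function `display_name` and the default 'Guest' are hypothetical:

```python
def display_name(user_input):
    # 'or' returns the first true operand, so an empty string
    # falls through to the default value 'Guest'
    return user_input or 'Guest'

print(display_name('Ashuka'))  # Ashuka
print(display_name(''))        # Guest
```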
Essential String Methods in Python
String methods are built-in functions that belong to string objects and help us manipulate or process strings. They help us change case, remove spaces, search for a specific word, split a string into pieces and much more.
Common Methods
Strictly speaking, these are built-in functions rather than string methods, so they work not only on strings but also on other data types like lists, sets, dictionaries and tuples. They are:
len()
It returns the number of characters in a string.
min()
It returns the character with the least Unicode value.
max()
It returns the character with the highest Unicode value.
sorted()
It sorts the characters of a string according to their Unicode values (in an ascending order) and returns it as a list.
print(len('Data Science'))
print(min('Happy'))
print(max('Happy'))
print(sorted('Science'))
print(sorted('Science',reverse=True)) #prints in reverse order according to the Unicode value
#Output
# 12
# H
# y
# ['S', 'c', 'c', 'e', 'e', 'i', 'n']
# ['n', 'i', 'e', 'e', 'c', 'c', 'S']
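Since sorted() gives back a list and not a string, we need one more step if we want a sorted string. A minimal sketch using ''.join(), which is covered later in this post:

```python
letters = sorted('Science')
print(type(letters))  # <class 'list'>

# Join the sorted characters back into a single string
sorted_text = ''.join(letters)
print(sorted_text)  # Scceein
```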
Other Methods
These methods belong to the string type and are called on a string using dot notation. They are:
.capitalize()
It converts the first character of a string to uppercase and the rest to lowercase.
.title()
It converts the first character of each word present in a string to uppercase and the rest to lowercase.
.upper()
It converts each character of a string to uppercase.
.lower()
It converts each character of a string to lowercase.
.swapcase()
It converts the uppercase characters of a string to lowercase and vice-versa.
s='daTa SCiencE'
print(s.capitalize())
print(s.title())
print(s.upper())
print(s.lower())
print(s.swapcase())
#Output
# Data science
# Data Science
# DATA SCIENCE
# data science
# DAtA scIENCe
.count()
It counts the number of times a sub-string appears in a string.
.startswith()
It checks if the given string starts with the given sub-string.
.endswith()
It checks if the given string ends with the given sub-string.
.find()
It returns the index where the sub-string first appears. If it is not found, it returns -1.
.index()
It returns the index where the sub-string first appears. If it is not found, it raises a ValueError, which is the only difference between .find() and .index().
a='Ashuka is my name'
print(a.count('a'))
print(a.count('is'))
print(a.count('z'))
print(a.startswith('Ash'))
print(a.endswith('me'))
print(a.startswith('m'))
print(a.endswith('ka'))
print(a.find('my'))
print(a.index('my'))
print(a.find('girl'))
# print(a.index('girl')) raises a ValueError
#Output
# 2
# 1
# 0
# True
# True
# False
# False
# 10
# 10
# -1
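Because of this difference, the two methods are handled differently in code. Here is a small sketch of both styles; the variable names `position` and `index_failed` are just for illustration:

```python
a = 'Ashuka is my name'

# With find(), we check the return value against -1
position = a.find('girl')
if position == -1:
    print('find: sub-string not found')

# With index(), we catch the ValueError instead
index_failed = False
try:
    a.index('girl')
except ValueError:
    index_failed = True
    print('index: ValueError raised')
```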
.format()
It inserts variables into the strings.
name='Ashuka'
gender='female'
print('Hi, I am {} and i am a {}'.format(name,gender))
print('Hi, i am {1} and i am a {0}'.format(gender,name))
#Output
# Hi, I am Ashuka and i am a female
# Hi, i am Ashuka and i am a female
While .format() is powerful and widely used, a newer and more compact way called ‘f-strings’ was introduced in Python 3.6. They work just like .format() but with less typing and better readability.
f-strings allow us to insert expressions (variables or calculations) into a string using curly braces {}. An f-string is not a method but a syntax feature: the string literal is simply prefixed with f.
name='Ashuka'
gender='female'
print(f'Hi, I am {name} and i am a {gender}')
#Output
# Hi, I am Ashuka and i am a female
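f-strings are not limited to plain variables. The sketch below, with made-up values for `price` and `quantity`, puts a calculation and a method call inside the braces and uses a format specifier (:.2f) to show two decimal places:

```python
name = 'Ashuka'
price = 49.5
quantity = 3

# A calculation with a format specifier for two decimal places
line = f'Total for {quantity} items: {price * quantity:.2f}'
print(line)  # Total for 3 items: 148.50

# A method call inside the braces
greeting = f'Hello, {name.upper()}!'
print(greeting)  # Hello, ASHUKA!
```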
.isalnum()
It checks if all the characters in a string are alphanumeric (only letters and numbers not special symbols or spaces).
.isalpha()
It checks if all the characters in a string are letters only.
.isdigit()
It checks if all the characters in a string are digits only.
.isidentifier()
It checks whether the string is a valid identifier, i.e. whether it follows the naming rules for a variable name in Python.
.isupper()
It checks if all the characters in a string are in uppercase or not.
.islower()
It checks if all the characters in a string are in lowercase or not.
print('helloo123'.isalnum())
print('helloo123 '.isalnum())
print()
print('123'.isdigit())
print('123#'.isdigit())
print()
print('helloo'.isalpha())
print('helloo '.isalpha())
print()
print('helloo123'.isidentifier())
print('123helloo123'.isidentifier())
print()
print('HI'.isupper())
print('As'.isupper())
print()
print('hi'.islower())
print('As'.islower())
#Output
# True
# False
# True
# False
# True
# False
# True
# False
# True
# False
# True
# False
.split()
It splits a string into a list based on a separator. By default it splits on whitespace, but we can also pass another separator.
.join()
It joins the items of a list (or any other iterable of strings) into one string. The separator is the string the method is called on, so there is no default separator.
.strip()
It removes all the extra whitespace at the beginning and end of the string, not the spaces in the middle.
.replace()
It replaces every occurrence of a sub-string with another one. If the sub-string is not present in the string, no changes occur.
print('Hi i am fine'.split())
print('Hi i am fine'.split('am'))
print()
print(''.join(['i','am','fine']))
print(' '.join(['i','am','fine']))
print('-'.join(['i','am','fine']))
print()
print('Hi my name fine'.replace('fine','great'))
print('Hi my name fine'.replace('amazing','great'))
print()
a=" hiii i am fine "
print(a.strip())
#Output
# ['Hi', 'i', 'am', 'fine']
# ['Hi i ', ' fine']
# iamfine
# i am fine
# i-am-fine
# Hi my name great
# Hi my name fine
# hiii i am fine
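These methods are often combined. As a rough sketch of a typical clean-up step (the sample strings are made up), split() followed by join() collapses repeated spaces, and strip() can remove characters other than spaces when we pass them as an argument:

```python
messy = '  hiii   i am    fine  '

# split() with no argument splits on any run of whitespace
# and drops the empty pieces at both ends
words = messy.split()
clean = ' '.join(words)
print(clean)  # hiii i am fine

# strip() removes the given characters from both ends
title = '###title###'.strip('#')
print(title)  # title
```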
Question: If strings are immutable, how do these string methods work?
In Python, strings are immutable, i.e. they cannot be changed once created. So, when we apply a string method to a string, it does not modify the original string. Instead, it creates and returns a new string with the changes applied. No changes are made to the original one.
a=" hiii i am fine "
print(a.strip())
print(a) #No changes are made here
#Output
# hiii i am fine
#  hiii i am fine   (the spaces at both ends are still there)
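This also means that if we want to keep the result of a method, we have to store it somewhere. A minimal sketch with illustrative variable names:

```python
original = 'daTa'

# upper() returns a new string; 'original' is untouched
shouted = original.upper()
print(original)  # daTa
print(shouted)   # DATA

# To keep the result under the same name, assign it back
original = original.upper()
print(original)  # DATA
```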
Conclusion
Python strings include various features like string creation, indexing, slicing, concatenation, repetition and comparisons, along with many string methods like find, count, split, join, replace, strip, etc. These are super useful in the real world, especially in data science, for tasks like cleaning CSV files, text processing and many more.
I hope this blog helps you as much as it helped me while writing it.
Feel free to leave a comment or a suggestion below.
