Python Strings - From Basics to Brilliance

Ashuka Acharya
10 min read

Introduction

As a beginner in Python, I’ve realized that text data is used everywhere in the real world: handling user input, cleaning and processing data, reading and writing files, debugging and much more. Python strings let us create, handle and manipulate this text effectively and efficiently. So in this blog post, I will share what I have learned about Python strings so far.

What is a String in Python?

A string in Python is a sequence of Unicode characters enclosed in single, double or triple quotes. It can contain letters, digits and special characters. Unlike C/C++, Python has no separate character data type, so a single character is simply a string of length 1.
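To see this in practice, here is a small sketch (the variable name ch is only for illustration) showing that a single character is an ordinary str of length 1:

```python
ch = 'A'                     # a single character is still a string

print(type(ch))              # <class 'str'>
print(isinstance(ch, str))   # True
print(len(ch))               # 1
```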

Creating Strings

In Python, strings can be created using single (' '), double (" ") or triple (''' ''' or """ """) quotes.

#Using Single Quotes
print('Hello everyone')

#Using Double Quotes
print("Welcome")

#Using Triple Quotes
print('''Have a nice day''')
print("""What are you doing?""")

You might be wondering why Python supports three different ways to create a string even though they all work the same way.

The main reason is flexibility, especially when a string itself contains quotes. You can include a quote character in a string without escaping it by enclosing the string in the other kind of quote. Triple quotes also let a string span multiple lines.

print("It's a nice day.")

print('She said,"Hello everyone".')

print('''Line One
Line Two
Line Three ''')
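For comparison, the same strings can be written with the escape character (a backslash) instead of switching quote styles. This is only an illustrative sketch, and the variable names are made up for the example:

```python
# Escaping the quote with a backslash gives the same string
with_escape = 'It\'s a nice day.'
without_escape = "It's a nice day."
print(with_escape == without_escape)        # True

# \n inside a normal string produces the same line break as triple quotes
multi_line = '''Line One
Line Two'''
print(multi_line == 'Line One\nLine Two')   # True
```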

Accessing Characters and Sub-strings

To access a single character or a part of a string, Python offers two ways:

  • Indexing

  • Slicing

  1. Indexing

Since strings are just a sequence of characters, we can use indexing to access individual characters. There are two types of indexing: Positive and Negative.

In positive indexing, characters are numbered from left to right starting at 0, then 1 and so on.

In negative indexing, characters are numbered from right to left starting at -1, then -2 and so on. This lets us access the last character without knowing the length of the string.

a='I am a girl'
print(a[0]) #Positive Indexing

print(a[-2]) #Negative Indexing

#Output:
#I
#r
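One way to think about negative indexing, sketched below, is that a[-k] refers to the same character as a[len(a) - k]:

```python
a = 'I am a girl'

last_positive = a[len(a) - 1]   # last character using positive indexing
last_negative = a[-1]           # same character using negative indexing

print(last_positive, last_negative)   # l l
print(a[-2] == a[len(a) - 2])         # True
```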
  2. Slicing

To access a part of a string, we use string slicing. The syntax is string[start:end:step], where start is the starting index, end is the stopping index (excluded) and step is the number of positions to move between characters (defaults to 1). Both positive and negative indexes can be used in slicing.

s='Summer'

#using positive indexing
print(s[0:4]) #from index 0 to 3
print(s[3:]) #from index 3 to the end
print(s[:4]) #from the beginning to index 3 (index 4 is excluded)
print(s[:]) #whole string
print(s[::2]) #every second character i.e. index 0,2,4
print(s[::-1]) #reversed string
print(s[5:0:-1]) #from index 5 to 1 (0 is excluded), backwards
print(s[::1]) #whole string with normal steps
print()

#using negative indexing
print(s[-3:]) #from index -3 to end
print(s[:-3]) #from the beginning to index -4
print(s[-6:-1:2]) #from index -6 to -2 with a step of 2
print(s[-1:-6:-1]) #from last character to index -5, reversed
print(s[-1::-1]) #reversed string

#Output
# Summ
# mer
# Summ
# Summer
# Sme
# remmuS
# remmu
# Summer

# mer
# Sum
# Sme
# remmu
# remmuS
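
As a small application of slicing, the reversed slice [::-1] can be used to check whether a word reads the same forwards and backwards. The helper name is_palindrome is made up for this sketch:

```python
def is_palindrome(word):
    """Return True if word reads the same forwards and backwards."""
    return word == word[::-1]

print(is_palindrome('level'))    # True
print(is_palindrome('Summer'))   # False
```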

Can Strings be Edited or Deleted?

Python strings are immutable, which means you cannot change, update or delete a character of a string after it has been created.

s='Hii'
s[0]='h' #This raises a TypeError

Since strings cannot be modified in place, we use string concatenation, string methods and other techniques to create new strings as needed. (We will cover these later in the blog.)
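
A minimal sketch of this workaround: instead of changing s in place, we build a new string from a new first character plus a slice of the old one.

```python
s = 'Hii'

try:
    s[0] = 'h'            # strings do not support item assignment
except TypeError as error:
    print('Error:', error)

new_s = 'h' + s[1:]       # build a new string instead
print(new_s)              # hii
print(s)                  # Hii (the original is unchanged)
```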

As for deletion, we cannot delete an individual character or a sub-string of a string. But we can delete the whole variable at once with del. Once deleted, the name cannot be used again until it is assigned a new value.

s='data science'
del s[0] #not possible (raises a TypeError)

s='Good Morning'
del s  #deletes the whole string
print(s) #raises a NameError because s was deleted in the previous line
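
The same behaviour can be observed without crashing the program by catching the error. This is just a sketch, and the replacement value 'Good Evening' is an arbitrary example:

```python
s = 'Good Morning'
del s                       # removes the name s

try:
    print(s)
except NameError as error:
    print('Error:', error)

s = 'Good Evening'          # assigning again makes the name usable
print(s)                    # Good Evening
```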

Common Operations on Strings

Since Python strings are immutable, they cannot be changed once created. However, there are several ways to work around this by building new strings. Here are some of the most common operations that you will use frequently:

  1. Concatenation

  2. Repetition

  3. Membership

  4. Comparisons

  5. Using loops with Strings

  6. Logical Operations

  • Concatenation

This operation uses a ‘+’ symbol to join two or more strings to form a new string.

name='Ashuka' + ' ' + 'Acharya'
print(name)

#Output
#Ashuka Acharya

  • Repetition

This operation joins multiple copies of a string using the ‘*’ operator.

print('Divider')
print('-'*15)

#Output
#Divider
#---------------
  • Membership

This operation lets you check whether a character or a sub-string is present in a string using the ‘in’ and ‘not in’ operators.

print('J' in 'Java')
print('j' not in 'Java')

#Output
#True
#True

email='ashuka123@mail.com'
if '@' in email:
    print('Valid Email.')

#Output
#Valid Email.
  • Comparisons

This operation compares strings in lexicographical (dictionary-like) order. Strings are compared character by character based on their Unicode code points (which match ASCII for English letters, digits and common symbols) using operators such as ‘<’, ‘>’, ‘==’ and ‘!=’. Because uppercase letters have lower code points than lowercase ones, the result is not always strictly alphabetical.


print('apple' < 'Apple') #False, since the code point of 'a' is greater than that of 'A'
print('car' == 'car') #each character has the same code point
print('car' < 'cars') #the shorter string is smaller if all of its characters match the start of the longer one
print('app' != 'apple') #the strings have different lengths, so they are not equal

#Output
#False
#True
#True
#True
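
The built-in ord() function returns the Unicode code point of a character, which makes it easy to see why these comparisons turn out the way they do. A short sketch (the variable names are illustrative):

```python
code_lower = ord('a')
code_upper = ord('A')

print(code_lower, code_upper)     # 97 65
print(code_lower > code_upper)    # True, which is why 'apple' < 'Apple' is False

# For a case-insensitive comparison, convert both sides to the same case first
same_ignoring_case = 'apple'.lower() == 'Apple'.lower()
print(same_ignoring_case)         # True
```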
  • Using loops with Strings

You can iterate through a string using both for and while loops.

for i in 'Hello':
    print(i,end='_')

print()

s='Ashuka'
while s:
    print(s[0],end=' ')
    s=s[1:]

#Output
#H_e_l_l_o_
#A s h u k a
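
Loops combine naturally with the membership operator. As a sketch, the following counts the vowels in a string (the variable names are illustrative):

```python
text = 'Data Science'
vowel_count = 0

for ch in text:
    if ch in 'aeiouAEIOU':    # membership test on each character
        vowel_count += 1

print(vowel_count)            # 5
```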
  • Logical Operations

This operation lets you build conditions that involve strings using the ‘and’, ‘or’ and ‘not’ operators.

print('-'*15)
print('For and')
print('-'*15)
print('hello' and 'world')
print('hello' and '')
print('-'*15)

print('For or')
print('-'*15)
print('hello' or 'world')
print('' or 'me')
print('-'*15)

print('For not')
print('-'*15)
print(not ' ')
print(not '')

#Output
# ---------------
# For and
# ---------------
# world

# ---------------
# For or
# ---------------
# hello
# me
# ---------------
# For not
# ---------------
# False
# True

Python treats any non-empty string as True (truthy) and an empty string as False (falsy). Note that a string containing only a space is not empty, so it is truthy.

‘and’ → returns the last operand if all operands are truthy, else the first falsy operand.

‘or’ → returns the first truthy operand if there is one, else the last falsy operand.

‘not’ → returns True if the operand is an empty string, else returns False.
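
A common practical use of this behaviour is providing a fallback value when a string is empty. In the sketch below, the function name display_name and the fallback 'Guest' are assumptions made for illustration:

```python
def display_name(name):
    """Return name, or 'Guest' when name is an empty string."""
    return name or 'Guest'

print(display_name('Ashuka'))   # Ashuka
print(display_name(''))         # Guest
```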

Essential String Methods in Python

String methods are built-in functions attached to string objects that help us manipulate or process strings. Together with a few general built-in functions, they help us change case, remove spaces, search for a specific word, measure the length, split a string into pieces and much more.

Common Methods

Strictly speaking, these are built-in functions rather than string methods, so they work not only on strings but also on other data types like list, set, dictionary and tuple. They are:

  1. len()

It returns the number of characters in a string.

  2. min()

It returns the character with the lowest Unicode value.

  3. max()

It returns the character with the highest Unicode value.

  4. sorted()

It sorts the characters of a string according to their Unicode values (in an ascending order) and returns it as a list.


print(len('Data Science')) 

print(min('Happy'))

print(max('Happy'))

print(sorted('Science'))
print(sorted('Science',reverse=True)) #prints in reverse order according to the Unicode value

#Output
# 12
# H
# y
# ['S', 'c', 'c', 'e', 'e', 'i', 'n']
# ['n', 'i', 'e', 'e', 'c', 'c', 'S']
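
Because sorted() returns a list, the characters need to be joined again if a string is wanted. One way to do this, as a sketch, is to call .join() on an empty string:

```python
letters = sorted('Science')        # a list of single-character strings
sorted_word = ''.join(letters)     # glue the list back into one string

print(letters)       # ['S', 'c', 'c', 'e', 'e', 'i', 'n']
print(sorted_word)   # Scceein
```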

Other Methods

These methods work only on strings and not on other data types. They are:

  • .capitalize()

It converts the first character of a string to uppercase and the rest to lowercase.

  • .title()

It converts the first character of each word present in a string to uppercase and the rest to lowercase.

  • .upper()

It converts each character of a string to uppercase.

  • .lower()

It converts each character of a string to lowercase.

  • .swapcase()

It converts the uppercase characters of a string to lowercase and vice-versa.


s='daTa SCiencE'
print(s.capitalize())
print(s.title())
print(s.upper())
print(s.lower())
print(s.swapcase())

#Output
# Data science
# Data Science
# DATA SCIENCE
# data science
# DAtA scIENCe
  • .count()

It counts the number of times a sub-string appears in a string.

  • .startswith()

It checks if the given string starts with the given sub-string.

  • .endswith()

It checks if the given string ends with the given sub-string.

  • .find()

It returns the index of the first occurrence of the sub-string. If not found, it returns -1.

  • .index()

It returns the index of the first occurrence of the sub-string. If not found, it raises a ValueError, which is the only difference between .find() and .index().

a='Ashuka is my name'

print(a.count('a'))
print(a.count('is'))
print(a.count('z'))

print(a.startswith('Ash'))
print(a.endswith('me'))
print(a.startswith('m'))
print(a.endswith('ka'))

print(a.find('my'))
print(a.index('my'))
print(a.find('girl'))
# print(a.index('girl')) raises a ValueError

#Output
# 2
# 1
# 0
# True
# True
# False
# False
# 10
# 10
# -1
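
Since the two methods report a missing sub-string differently, they are handled differently in code. A minimal sketch of both styles:

```python
a = 'Ashuka is my name'

# .find() signals "not found" with -1, so check the result before using it
position = a.find('girl')
if position == -1:
    print('Not found')

# .index() signals "not found" by raising ValueError, so catch it
try:
    a.index('girl')
except ValueError as error:
    print('Error:', error)
```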
  • .format()

It inserts values into the placeholders ({}) of a string.

name='Ashuka'
gender='female'
print('Hi, I am {} and i am a {}'.format(name,gender))

print('Hi, i am {1} and i am a {0}'.format(gender,name))


#Output
# Hi, I am Ashuka and i am a female
# Hi, i am Ashuka and i am a female

While .format() is powerful and widely used, Python 3.6 introduced a newer, more compact alternative called ‘f-strings’. They work much like .format() but need less typing and are easier to read.

f-strings let you insert expressions (variables or calculations) directly into a string using curly braces {}. They are not a string method but a syntax feature: you prefix the string literal with f.

name='Ashuka'
gender='female'

print(f'Hi, I am {name} and i am a {gender}')

#Output
# Hi, I am Ashuka and i am a female
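
The braces of an f-string can hold calculations and method calls as well as plain variables. The following sketch uses made-up values (price, quantity) to illustrate this:

```python
price = 49.5
quantity = 3

# Any expression can go inside the braces, not just a variable name
total_line = f'Total: {price * quantity}'
print(total_line)        # Total: 148.5

# A format specifier after the colon controls how the value is shown
rounded_line = f'Total: {price * quantity:.2f}'
print(rounded_line)      # Total: 148.50

# Methods can be called inside the braces too
name = 'ashuka'
greeting = f'Hello, {name.title()}!'
print(greeting)          # Hello, Ashuka!
```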
  • .isalnum()

It checks if all the characters in a string are alphanumeric (only letters and numbers, no special symbols or spaces).

  • .isalpha()

It checks if all the characters in a string are letters only.

  • .isdigit()

It checks if all the characters in a string are digits only.

  • .isidentifier()

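It checks whether the string is a valid identifier, i.e. whether it follows the naming rules for a Python variable name. Note that reserved keywords such as 'for' also pass this check even though they cannot be used as variable names.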

  • .isupper()

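It checks if all the letters in a string are in uppercase. Digits, spaces and symbols are ignored, but the string must contain at least one letter.

  • .islower()

It checks if all the letters in a string are in lowercase. Digits, spaces and symbols are ignored, but the string must contain at least one letter.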

print('helloo123'.isalnum())
print('helloo123 '.isalnum())
print()
print('123'.isdigit())
print('123#'.isdigit())
print()
print('helloo'.isalpha())
print('helloo '.isalpha())
print()
print('helloo123'.isidentifier())
print('123helloo123'.isidentifier())
print()
print('HI'.isupper())
print('As'.isupper())
print()
print('hi'.islower())
print('As'.islower())

#Output
# True
# False

# True
# False

# True
# False

# True
# False

# True
# False

# True
# False
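
These checks are handy for validating input before converting it. The sketch below uses a hypothetical helper, parse_age, and assumes plain ASCII input (some Unicode characters such as superscript digits also count as digits for .isdigit() but cannot be converted by int()):

```python
def parse_age(text):
    """Return text as an int if it contains only digits, else None."""
    cleaned = text.strip()          # ignore spaces around the value
    if cleaned.isdigit():
        return int(cleaned)
    return None

print(parse_age('25'))      # 25
print(parse_age(' 25 '))    # 25
print(parse_age('25a'))     # None
print(parse_age('-5'))      # None, because '-' is not a digit
```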
  • .split()

It splits a string into a list based on a separator. By default it splits on any whitespace, but we can also pass another separator.

  • .join()

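It joins the items of a list (or any other iterable of strings) into one string. It is called on the separator string itself, so there is no default separator: ' '.join(...) joins with spaces and ''.join(...) joins with nothing in between.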

  • .strip()

It removes all the whitespace at the beginning and end of the string, but not the spaces in the middle.

  • .replace()

It replaces every occurrence of a sub-string with another one. If the sub-string is not present in the string, no change occurs.

print('Hi i am fine'.split())
print('Hi i am fine'.split('am'))
print()
print(''.join(['i','am','fine']))
print(' '.join(['i','am','fine']))
print('-'.join(['i','am','fine']))
print()
print('Hi my name fine'.replace('fine','great'))
print('Hi my name fine'.replace('amazing','great'))
print()
a="   hiii i am     fine      "
print(a.strip())

#Output
# ['Hi', 'i', 'am', 'fine']
# ['Hi i ', ' fine']

# iamfine
# i am fine
# i-am-fine

# Hi my name great
# Hi my name fine

# hiii i am     fine
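
Since .strip() leaves the extra spaces in the middle untouched, a common idiom (shown here as a sketch) is to combine .split() and .join() to collapse every run of whitespace into a single space:

```python
a = "   hiii i am     fine      "

words = a.split()            # no separator: splits on any run of whitespace
cleaned = ' '.join(words)    # rebuild the sentence with single spaces

print(words)     # ['hiii', 'i', 'am', 'fine']
print(cleaned)   # hiii i am fine
```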

Question: If strings are immutable, how do these string methods work?

In Python, strings are immutable, i.e. they cannot be changed once created. So when we call a string method, it does not modify the original string. Instead, it creates and returns a new string with the changes applied. The original string stays the same.

a="   hiii i am     fine      "
print(a.strip())
print(a) #No changes are made here

#Output
# hiii i am     fine
#    hiii i am     fine
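
If the changed version is the one you want to keep, assign the result back to the variable. A small sketch, using repr() so the surrounding spaces are visible:

```python
a = "   hiii i am     fine      "
original = a

a.strip()              # the result is thrown away, a is unchanged
print(repr(a))         # still has the leading and trailing spaces

a = a.strip()          # assign the new string back to the name a
print(repr(a))         # the leading and trailing spaces are gone
```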

Conclusion

Python strings offer many features like string creation, indexing, slicing, concatenation, repetition and comparisons, along with many string methods like find, count, split, join, replace, strip, etc. These are super useful in the real world, especially in data science for tasks like cleaning CSV files, text processing and many more.


I hope this blog helps you as much as it helped me while writing it.

Feel free to leave a comment or a suggestion below.
