Python Sorting

Ritwik MathRitwik Math
7 min read

Python has a built-in sorted function that generates a sorted sequence from an iterable. Python list comes with a sort method that fulfills the same purpose.

The sorted function accepts one iterable as an argument.

unsorted_list = [4,3,7,3,1,9,5]
sorted_list = sorted(unsorted_list)
print(sorted_list)  # [1, 3, 3, 4, 5, 7, 9]

Optionally, you can pass a function for the argument key. This function is used to sort the iterable.

unsorted_list = [4,3,7,3,1,9,5]
sorted_list = sorted(unsorted_list, key=lambda x: -x)
print(sorted_list)  # [9, 7, 5, 4, 3, 3, 1]

The difference between the sorted function and the list’s sort method is that. The sorted function generates a new list, but the sort method updates it.

Sort a List of Dictionaries

Let’s take a list of dictionaries that represent employees. Each dictionary refers to one employee.

employees = [
    {"name": "Alice", "salary": 80000, "is_active": True},
    {"name": "Bob", "salary": 95000, "is_active": False},
    {"name": "Alice", "salary": 80000, "is_active": False},
    {"name": "Charlie", "salary": 95000, "is_active": True},
]

Now, the job is to sort the list based on employee salary.

sorted_employees = sorted(employees, key=lambda x: x["salary"])

print(sorted_employees)

"""
[
    {'name': 'Alice', 'salary': 80000, 'is_active': True},
    {'name': 'Alice', 'salary': 80000, 'is_active': False},
    {'name': 'Bob', 'salary': 95000, 'is_active': False},
    {'name': 'Charlie', 'salary': 95000, 'is_active': True}
]
"""

By default, the list is sorted in ascending order. To make it sort in descending order, we have to use the negative sign.

sorted_employees = sorted(employees, key=lambda x: -x["salary"])

print(sorted_employees)

"""
[
    {'name': 'Bob', 'salary': 95000, 'is_active': False},
    {'name': 'Charlie', 'salary': 95000, 'is_active': True},
    {'name': 'Alice', 'salary': 80000, 'is_active': True},
    {'name': 'Alice', 'salary': 80000, 'is_active': False}
]
"""

If this list needs to be sorted in such a way that active employees will get precedence if the same salary is equal. Now, we have to sort using two items, one is salary, the one is is_active.

Hint:

If you put a ‘+‘ or ‘-’ sign in front of a boolean value, it is converted into an integer.

sorted_employees = sorted(employees, key=lambda x: (-x["salary"], -x["is_active"]))

print(sorted_employees)

"""
[
    {'name': 'Charlie', 'salary': 95000, 'is_active': True},
    {'name': 'Bob', 'salary': 95000, 'is_active': False},
    {'name': 'Alice', 'salary': 80000, 'is_active': True},
    {'name': 'Alice', 'salary': 80000, 'is_active': False}
]
"""

Instead of returning an integer, only salary, lambda now returns a tuple containing two elements.

Sort Special Characters

A special character is a symbol that isn’t a standard letter or a number. ‘Å’ is a special character. If we can break this, we get the following.

If we try to sort these special characters like normal letters, we will see a randomly sorted list of strings.

names = ["Zoë", "Åsa", "Émile", "Ana", "Zoe", "Édouard"]

sorted_names = sorted(names)

print(sorted_names) # ['Ana', 'Zoe', 'Zoë', 'Åsa', 'Édouard', 'Émile']

To sort the list, we can use the unicodedata module’s normalize method. The ‘normalize’ method accepts two arguments: normalization form, NFD or NFC, and string.

There is an issue in this approach, the normalization form we have to pass at the time of defining the sorted function. To resolve it, we can use the functools module’s partial method. Partial method partially initializes a function with some of the arguments already provided before the function is called.

from unicodedata import normalize
from functools import partial

names = ["Zoë", "Åsa", "Émile", "Ana", "Zoe", "Édouard"]

sorted_names = sorted(names, key=partial(normalize, "NFD"))

print(sorted_names)  # ['Ana', 'Åsa', 'Édouard', 'Émile', 'Zoe', 'Zoë']

The following code is not a good practice but will give you a clear picture of what is happening

sorted_names = sorted(
    names,
    key=lambda x: partial(normalize, "NFD")(x)
)

Sort a List of Class Instances

We have an Employee class. Employee class has attributes id, name, department, email, is_active, and salary.

class Employee:
    def __init__(self, id, name, department, email, is_active, salary):
        self.id = id
        self.name = name
        self.department = department
        self.email = email
        self.is_active = is_active
        self.salary = salary

    def __repr__(self):
        return f"Employee({self.id}, {self.name}, {self.department}, ${self.salary})"

Let’s create a list containing instances of the Employee class.

employee_dicts = [
    {"id": 1001, "name": "Sofia Müller", "department": "Engineering", "email": "sofia.muller@company.com", "is_active": True, "salary": 91000},
    {"id": 1002, "name": "Lucas Pereira", "department": "Finance", "email": "lucas.pereira@company.com", "is_active": False, "salary": 88000},
    {"id": 1003, "name": "Anika Mehta", "department": "Marketing", "email": "anika.mehta@company.com", "is_active": True, "salary": 73000},
    {"id": 1004, "name": "Jakub Nowak", "department": "Sales", "email": "jakub.nowak@company.com", "is_active": False, "salary": 99000},
    {"id": 1005, "name": "Zoë Dubois", "department": "HR", "email": "zoe.dubois@company.com", "is_active": True, "salary": 76000}
]

employees = [Employee(**data) for data in employee_dicts]

The task is to sort the list in ascending order of employee salary. We can only use the sort method without passing any function to the argument ‘key‘.

employees.sort()

print(employees)

"""
Traceback (most recent call last):
  File "main.py", line 23, in <module>
    employees.sort()
TypeError: '<' not supported between instances of 'Employee' and 'Employee'
"""

The error is generated because, by default, sort uses less than or __lt__ under method to compare the elements in a list. If we define the __lt__ in the Employee class, then the above code will work.

employees = [Employee(**data) for data in employee_dicts]

Employee.__lt__ = lambda self, other: self.salary < other.salary

employees.sort()

print(employees)

"""
[
    Employee(1003, Anika Mehta, Marketing, $73000),
    Employee(1005, Zoë Dubois, HR, $76000),
    Employee(1002, Lucas Pereira, Finance, $88000),
    Employee(1001, Sofia Müller, Engineering, $91000),
    Employee(1004, Jakub Nowak, Sales, $99000)
]
"""

We can also define the method at the time of defining the Employee class.

class Employee:
    def __init__(self, id, name, department, email, is_active, salary):
        self.id = id
        self.name = name
        self.department = department
        self.email = email
        self.is_active = is_active
        self.salary = salary

    def __lt__(self, other):
        return self.salary < other.salary

    def __repr__(self):
        return f"Employee({self.id}, {self.name}, {self.department}, ${self.salary})"

Reverse a List

We can use the ‘reverse’ built-in function or use a negative sign, but the best way is to use the reverse keyword argument on the sorted function.

sorted_employees = sorted(employees, key=lambda x: x.salary, reverse=True)

"""
[
    Employee(1004, Jakub Nowak, Sales, $99000),
    Employee(1001, Sofia Müller, Engineering, $91000),
    Employee(1002, Lucas Pereira, Finance, $88000),
    Employee(1005, Zoë Dubois, HR, $76000),
    Employee(1003, Anika Mehta, Marketing, $73000)
]
"""

Exercises:

Now let’s try to solve three questions. Please try to complete the exercise by yourself first. Do not look at the answers first.

Q1. Sort a List of Dictionaries by Multiple Criteria

Sort a list of employees (dicts) by:

Highest salary

  • If salary is same, then by name in alphabetical order

  • If name is also same, sort active employees first

  • python Copy Edit

employees = [
    {"name": "Alice", "salary": 80000, "is_active": True},
    {"name": "Bob", "salary": 95000, "is_active": False},
    {"name": "Alice", "salary": 80000, "is_active": False},
    {"name": "Charlie", "salary": 95000, "is_active": True},
]

Q2. Sort with Mixed Types and Handle Errors Gracefully

You’re given a list with integers, floats, strings, and None. Sort this list numerically by treating strings as numbers where possible, and pushing None to the end.

data = [3, '5', 2.1, None, '10.5', 7, 'abc', None, 1]

Answer

# Firt try
[1, 2.1, 3, '5', 7, '10.5', 'abc', None, None]

# Bonus try to get the following result
[1, '10.5', 2.1, 3, '5', 7, 'abc', None, None]

Q3. Sort a List of Strings by Their Frequency in Another List

preferred_order = ['low', 'medium', 'high', 'critical']
tickets = ['high', 'low', 'critical', 'medium', 'medium', 'low']

Sort tickets according to the order in preferred_order.

Bonus: If a word is not in preferred_order, put it at the end.

Q1 Answer:

employees = [
    {"name": "Alice", "salary": 80000, "is_active": True},
    {"name": "Bob", "salary": 95000, "is_active": False},
    {"name": "Alice", "salary": 80000, "is_active": False},
    {"name": "Charlie", "salary": 95000, "is_active": True},
]

sorted_employees = sorted(employees, key=lambda x: (-x["salary"], x["name"], -x["is_active"]))

Q2 Answer:

data = [3, '5', 2.1, None, '10.5', 7, 'abc', None, 1]

def sortDifferentDataTypes(v):
    if v is None:
        return float('inf')
    if type(v) == str:
        try:
            return float(v)
        except:
            return float(sum(list(bytearray(v.encode('utf-8')))))
    return float(v)

sorted_data = sorted(data, key=sortDifferentDataTypes)

print(sorted_data)  # [1, 2.1, 3, '5', 7, '10.5', 'abc', None, None]

Bonus

data = [3, '5', 2.1, None, '10.5', 7, 'abc', None, 1]

def sortDifferentDataTypes(v):
    if v is None:
        return "".join(['z']*100)
    return str(v)

sorted_data = sorted(data, key=sortDifferentDataTypes)

print(sorted_data)  # [1, '10.5', 2.1, 3, '5', 7, 'abc', None, None]

Q3 Answer:

preferred_order = ['low', 'medium', 'high', 'critical']
tickets = ['high', 'low', 'critical', 'medium', 'medium', 'low']


prefered_index = {val:key for key, val in enumerate(preferred_order)}

sorted_tickets = sorted(tickets, key=prefered_index.__getitem__)

print(sorted_tickets)  # ['low', 'low', 'medium', 'medium', 'high', 'critical']

Bonus

Assume ‘Critical’ is not in the preferred_order list.

from functools import partial

preferred_order = ['low', 'medium', 'high']
tickets = ['high', 'low', 'critical', 'medium', 'medium', 'low']

def sortUsingPreference(index_dict, key):
    if key not in index_dict:
        return float('inf')
    return float(index_dict[key])


prefered_index = {val:key for key, val in enumerate(preferred_order)}

sorted_tickets = sorted(tickets, key=partial(sortUsingPreference, prefered_index))

print(sorted_tickets)
0
Subscribe to my newsletter

Read articles from Ritwik Math directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Ritwik Math
Ritwik Math

As a seasoned senior web developer with a wealth of experience in Python, Javascript, web development, MySQL, MongoDB, and React, I am passionate about crafting exceptional digital experiences that delight users and drive business success.