Python itertools Tutorial: Mastering Efficient Iteration

Vigneswaran S

The itertools module in Python provides a collection of fast, memory-efficient tools that are useful by themselves or in combination. They are designed to work together to form more complex iterator pipelines. If you find yourself writing loops with lists, especially if those lists can get very large, itertools often offers a more Pythonic and performant alternative.

Why itertools?

  1. Memory Efficiency: itertools functions return iterators, which produce items one at a time. This means they don't load entire sequences into memory, making them ideal for large or even infinite sequences.

  2. Performance: In CPython, the itertools functions are implemented in C, so they typically run faster than an equivalent loop written in pure Python.

  3. Code Conciseness: They allow you to express complex iteration patterns in a more compact and readable way.

  4. Mathematical Combinatorics: They provide built-in solutions for common combinatorial problems like permutations, combinations, and cartesian products.
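To make the memory point concrete, here is a minimal sketch that compares a fully built list with a lazy pipeline producing the same values. The variable names are illustrative, sys.getsizeof reports only the size of the container object itself (not the integers it refers to), and the exact byte counts will vary by Python version and platform.

```python
import itertools
import sys

n = 100_000

# A list holds a reference to every element up front.
squares_list = [i * i for i in range(n)]

# A lazy pipeline keeps only its current state, however large n is.
squares_iter = map(lambda i: i * i, itertools.islice(itertools.count(), n))

list_size = sys.getsizeof(squares_list)  # shallow size of the list object
iter_size = sys.getsizeof(squares_iter)  # shallow size of the map object

print(f"list:     {list_size:,} bytes")
print(f"iterator: {iter_size:,} bytes")

# Both yield the same values; the iterator computes them on demand.
total = sum(squares_iter)
print(total == sum(squares_list))
```

The trade-off is that the iterator can only be consumed once, while the list can be indexed and reused.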

Getting Started

To use itertools, you simply import it:

Python

import itertools

Let's dive into some of the most commonly used functions.


1. Infinite Iterators

These iterators generate sequences indefinitely. You'll typically use break conditions or islice to limit their output.

count(start=0, step=1)

Creates an iterator that returns evenly spaced values starting with start.

Python

import itertools

# Count from 0, step by 1 (default)
for i in itertools.count():
    print(i)
    if i >= 5:
        break # Important to break infinite loops!

print("\n---")

# Count from 10, step by 2
for i in itertools.count(10, 2):
    print(i)
    if i >= 20:
        break
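An explicit break is not the only way to stop count. As a small sketch, pairing it with zip also works, because zip stops at its shortest input. The names list is made up for illustration.

```python
import itertools

names = ["Alice", "Bob", "Charlie"]

# zip stops when names runs out, so the infinite counter is cut off safely.
numbered = list(zip(itertools.count(1), names))
print(numbered)  # [(1, 'Alice'), (2, 'Bob'), (3, 'Charlie')]
```

For this particular case enumerate(names, start=1) does the same job; count becomes more useful when you need a step other than 1.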

cycle(iterable)

Creates an iterator that endlessly repeats the elements of an iterable.

Python

import itertools

count = 0
for item in itertools.cycle(['A', 'B', 'C']):
    print(item)
    count += 1
    if count >= 7: # Limit the output
        break
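A typical use of cycle is round-robin assignment. The sketch below pairs a finite list of tasks with an endlessly repeating list of workers; the task and worker names are hypothetical.

```python
import itertools

tasks = ["build", "test", "lint", "deploy", "notify"]  # hypothetical task names
workers = ["w1", "w2"]                                  # hypothetical worker names

# zip ends when tasks is exhausted, so cycle never runs away.
assignments = list(zip(tasks, itertools.cycle(workers)))
for task, worker in assignments:
    print(f"{task} -> {worker}")
```

Keep in mind that cycle stores a copy of every element it has seen, so it is best suited to short iterables.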

repeat(object, times=None)

Creates an iterator that returns object over and over. If times is given, it stops after that many repetitions; if times is omitted (the default), it repeats indefinitely.

Python

import itertools

# Repeat 'Hello' 3 times
for greeting in itertools.repeat('Hello', 3):
    print(greeting)

print("\n---")

# Repeat a list indefinitely (be careful!)
count = 0
for item in itertools.repeat([1, 2, 3]):
    print(item)
    count += 1
    if count >= 4:
        break
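repeat is often used to feed a constant argument to map. In this sketch, map stops when range(5) is exhausted, so the unbounded repeat(2) is safe to use without a limit.

```python
import itertools

# pow(0, 2), pow(1, 2), ..., pow(4, 2)
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```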

2. Terminating Iterators

These iterators produce finite output when given finite input. Some of them, such as islice and takewhile, can also cut an infinite iterator down to a finite one.

chain(*iterables)

Treats multiple iterables as a single sequence.

Python

import itertools

# Chain lists
for item in itertools.chain([1, 2, 3], ('a', 'b'), 'XYZ'):
    print(item)

print("\n---")

# Chain from iterators
iter1 = (x**2 for x in range(3))
iter2 = (chr(ord('a') + i) for i in range(3))
for item in itertools.chain(iter1, iter2):
    print(item)
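When the iterables are already collected in a single container, chain.from_iterable avoids unpacking them with *. A minimal sketch that flattens one level of nesting (the nested list is made up for illustration):

```python
import itertools

nested = [[1, 2], [3], [], [4, 5, 6]]

# Lazily walks each inner list in turn; empty lists contribute nothing.
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5, 6]
```

Only one level is flattened, so more deeply nested lists would stay as they are.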

compress(data, selectors)

Yields the elements of data whose corresponding element in selectors is truthy. It stops as soon as either data or selectors is exhausted.

Python

import itertools

data = ['A', 'B', 'C', 'D', 'E']
selectors = [True, False, True, True, False]

for item in itertools.compress(data, selectors):
    print(item) # Output: A, C, D

print("\n---")

prices = [100, 200, 50, 150, 300]
is_affordable = [True, False, True, True, False]

for price in itertools.compress(prices, is_affordable):
    print(f"Affordable: ${price}")

dropwhile(predicate, iterable)

Drops elements from the iterable as long as the predicate is true. Once the predicate becomes false, all remaining elements are yielded.

Python

import itertools

data = [1, 2, 3, 4, 1, 2, 5, 6]

# Drop elements while they are less than 3
for item in itertools.dropwhile(lambda x: x < 3, data):
    print(item) # Output: 3, 4, 1, 2, 5, 6

print("\n---")

# Drop leading spaces
text = "   hello world"
for char in itertools.dropwhile(lambda x: x == ' ', text):
    print(char, end='')

filterfalse(predicate, iterable)

Yields the elements of iterable for which the predicate returns a false value. This is the complement of the built-in filter().

Python

import itertools

numbers = [1, 2, 3, 4, 5, 6]

# Keep elements that are NOT even (i.e., odd numbers)
for num in itertools.filterfalse(lambda x: x % 2 == 0, numbers):
    print(num) # Output: 1, 3, 5

print("\n---")

words = ["apple", "banana", "cat", "dog"]
# Filter out words that start with 'a'
for word in itertools.filterfalse(lambda w: w.startswith('a'), words):
    print(word)

islice(iterable, stop) / islice(iterable, start, stop[, step])

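Returns an iterator that yields selected elements from the iterable, similar to slicing a list. Unlike list slicing, it works on any iterable and does not support negative values for start, stop, or step.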

Python

import itertools

data = range(10)

# Take first 5 elements
for i in itertools.islice(data, 5):
    print(i) # Output: 0, 1, 2, 3, 4

print("\n---")

# Take elements from index 2 to 7 (exclusive)
for i in itertools.islice(data, 2, 7):
    print(i) # Output: 2, 3, 4, 5, 6

print("\n---")

# Take elements from index 1 up to 10 (exclusive), with a step of 2
for i in itertools.islice(data, 1, 10, 2):
    print(i) # Output: 1, 3, 5, 7, 9

print("\n---")

# Using islice with an infinite iterator (very useful!)
for i in itertools.islice(itertools.count(100), 5):
    print(i) # Output: 100, 101, 102, 103, 104
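Calling islice repeatedly on the same iterator is a simple way to process data in fixed-size batches. The chunked function below is a hypothetical helper written for this sketch, not something itertools provides.

```python
import itertools

def chunked(iterable, size):
    """Yield lists of up to `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(iterable)  # one shared iterator, so each islice continues where the last stopped
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk

batches = list(chunked(range(7), 3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```

On Python 3.12 and later, itertools.batched covers the same need and yields tuples instead of lists.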

takewhile(predicate, iterable)

Yields elements from the iterable as long as the predicate is true. Once the predicate becomes false, iteration stops.

Python

import itertools

data = [1, 2, 3, 4, 1, 2, 5, 6]

# Take elements while they are less than 3
for item in itertools.takewhile(lambda x: x < 3, data):
    print(item) # Output: 1, 2

print("\n---")

temperatures = [20, 22, 25, 28, 30, 26, 24]
# Take temperatures while they are below 30
for temp in itertools.takewhile(lambda t: t < 30, temperatures):
    print(f"{temp}°C")

tee(iterable, n=2)

Returns n independent iterators from a single iterable. This is useful if you need to iterate over the same data multiple times without recreating it, especially if the original iterable is expensive to generate or can only be iterated once.

Python

import itertools

data = [1, 2, 3, 4]
iter1, iter2 = itertools.tee(data)

print(f"Iterator 1: {list(iter1)}") # Convert to list to demonstrate consumption
print(f"Iterator 2: {list(iter2)}") # Can be consumed independently

print("\n---")

# Example: Processing a file multiple times without reopening
with open('my_data.txt', 'w') as f:
    f.write("Line 1\nLine 2\nLine 3\n")

with open('my_data.txt', 'r') as f:
    lines_iter1, lines_iter2 = itertools.tee(f)

    print("First pass (reading lines):")
    for line in lines_iter1:
        print(line.strip())

    print("\nSecond pass (counting characters):")
    total_chars = 0
    for line in lines_iter2:
        total_chars += len(line.strip())
    print(f"Total characters: {total_chars}")

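Note: tee works with any iterable, including one-shot generators, but it buffers every item that one of the returned iterators has produced and another has not yet consumed. In the file example above, the first pass reads every line before the second pass starts, so all lines end up buffered in memory; in that situation, building a list is usually simpler and faster. Once you have called tee, don't use the original iterable directly, because advancing it would make the tee iterators skip items.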
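tee is most economical when the copies advance roughly in step with each other. A classic case is looking at consecutive pairs, sketched below on a one-shot generator. The readings data is made up for illustration.

```python
import itertools

def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    a, b = itertools.tee(iterable)
    next(b, None)  # advance the second copy by one item
    return zip(a, b)

readings = (x * x for x in range(5))  # generator yielding 0, 1, 4, 9, 16
diffs = [later - earlier for earlier, later in pairwise(readings)]
print(diffs)  # [1, 3, 5, 7]
```

Because the two copies are never more than one item apart, tee only has to buffer a single element here. Python 3.10 and later ship this as itertools.pairwise.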

zip_longest(*iterables, fillvalue=None)

Aggregates elements from each of the iterables. If the iterables are of different lengths, missing values are filled with fillvalue (defaults to None).

Python

import itertools

names = ['Alice', 'Bob', 'Charlie']
ages = [30, 24]
cities = ['New York', 'London', 'Paris', 'Tokyo']

for name, age, city in itertools.zip_longest(names, ages, cities, fillvalue='N/A'):
    print(f"{name:<10} {age:<5} {city}")

3. Combinatoric Generators

These functions are perfect for generating permutations, combinations, and cartesian products.

product(*iterables, repeat=1)

Generates the Cartesian product of input iterables. Equivalent to nested for-loops.

Python

import itertools

# Product of two lists
for p in itertools.product([1, 2], ['a', 'b']):
    print(p)
# Output: (1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')

print("\n---")

# Product with 'repeat'
# Equivalent to: itertools.product(range(2), range(2), range(2))
for p in itertools.product(range(2), repeat=3):
    print(p)
# Output: (0, 0, 0), (0, 0, 1), (0, 1, 0), ..., (1, 1, 1)

print("\n---")

# Generating all possible PINs (4-digit numbers)
digits = '0123456789'
pin_count = 0
for pin_tuple in itertools.product(digits, repeat=4):
    pin = "".join(pin_tuple)
    # print(pin) # Uncomment to see all PINs
    pin_count += 1
print(f"Total 4-digit PINs: {pin_count}")
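A practical use of product is expanding a dictionary of options into every possible configuration. This is a sketch with a made-up grid of product options; it relies on dictionaries preserving insertion order (Python 3.7+).

```python
import itertools

# Hypothetical option grid
grid = {
    "size": ["S", "M"],
    "color": ["red", "blue"],
    "sleeve": ["short", "long"],
}

# One dict per combination; the rightmost option varies fastest,
# just like the innermost loop of nested for-loops.
variants = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]

print(len(variants))  # 8
print(variants[0])    # {'size': 'S', 'color': 'red', 'sleeve': 'short'}
print(variants[1])    # {'size': 'S', 'color': 'red', 'sleeve': 'long'}
```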

permutations(iterable, r=None)

Returns successive r-length permutations of elements from the iterable. If r is not specified or is None, then r defaults to the length of the iterable and all possible full-length permutations are generated.

Python

import itertools

# All permutations of 'ABC'
for p in itertools.permutations('ABC'):
    print("".join(p))
# Output: ABC, ACB, BAC, BCA, CAB, CBA

print("\n---")

# Permutations of length 2 from 'ABC'
for p in itertools.permutations('ABC', 2):
    print("".join(p))
# Output: AB, AC, BA, BC, CA, CB

combinations(iterable, r)

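Returns r-length subsequences of elements from the input iterable. Order does not matter, so each selection is emitted once, with its elements in the order they appear in the input ('AB' is produced, 'BA' is not). Elements are treated as unique based on their position, not their value, so repeated values in the input can produce combinations that look identical.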

Python

import itertools

# Combinations of length 2 from 'ABC'
for c in itertools.combinations('ABC', 2):
    print("".join(c))
# Output: AB, AC, BC (order doesn't matter, so BA is not included after AB)

print("\n---")

# Choosing 2 items from a list of numbers
numbers = [1, 2, 3, 4]
for combo in itertools.combinations(numbers, 2):
    print(combo)
# Output: (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)

combinations_with_replacement(iterable, r)

Returns r-length subsequences of elements from the input iterable allowing individual elements to be repeated.

Python

import itertools

# Combinations with replacement of length 2 from 'ABC'
for c in itertools.combinations_with_replacement('ABC', 2):
    print("".join(c))
# Output: AA, AB, AC, BB, BC, CC

print("\n---")

# Unordered outcomes of 2 coin tosses (heads-then-tails and tails-then-heads count as one outcome)
coins = ['H', 'T']
for result in itertools.combinations_with_replacement(coins, 2):
    print(result)
# Output: ('H', 'H'), ('H', 'T'), ('T', 'T')
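It helps to know how many results each combinatoric function will produce before looping over them. The sketch below counts the results for a 5-element input with r = 3 and checks them against the standard counting formulas, using math.perm and math.comb (available from Python 3.8).

```python
import itertools
import math

items = "ABCDE"
n, r = len(items), 3

counts = {
    "product": sum(1 for _ in itertools.product(items, repeat=r)),
    "permutations": sum(1 for _ in itertools.permutations(items, r)),
    "combinations": sum(1 for _ in itertools.combinations(items, r)),
    "combinations_with_replacement": sum(
        1 for _ in itertools.combinations_with_replacement(items, r)
    ),
}

expected = {
    "product": n ** r,                                          # 125
    "permutations": math.perm(n, r),                            # 60
    "combinations": math.comb(n, r),                            # 10
    "combinations_with_replacement": math.comb(n + r - 1, r),   # 35
}

for name, count in counts.items():
    print(f"{name}: {count} (formula: {expected[name]})")
```

These counts grow quickly, so it is worth computing the formula first when n or r is large.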

4. Grouping Data

groupby(iterable, key=None)

Makes an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If no key function is specified or is None, the element itself is used as the key.

Important: groupby() only groups consecutive elements that share a key; it starts a new group every time the key changes. If you want exactly one group per key across the whole dataset, sort the iterable by the same key function first.

Python

import itertools

data = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25},
    {'name': 'Charlie', 'age': 30},
    {'name': 'David', 'age': 25},
    {'name': 'Eve', 'age': 30},
]

# Sort the data by age for proper grouping
data.sort(key=lambda x: x['age'])

print("Sorted data:")
for item in data:
    print(item)

print("\n--- Grouping by age ---")
for age, group in itertools.groupby(data, key=lambda x: x['age']):
    print(f"Age: {age}")
    for person in group:
        print(f"  - {person['name']}")

print("\n--- Example with unsorted data (to show the effect) ---")
unsorted_data = [1, 2, 2, 3, 1, 1, 2]
for k, g in itertools.groupby(unsorted_data):
    print(f"Key: {k}, Group: {list(g)}")
# Output:
# Key: 1, Group: [1]
# Key: 2, Group: [2, 2]
# Key: 3, Group: [3]
# Key: 1, Group: [1, 1]
# Key: 2, Group: [2]
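The consecutive-only behavior is not just a pitfall; it is exactly what run-length encoding needs. A minimal sketch, where run_length_encode is a hypothetical helper name:

```python
import itertools

def run_length_encode(text):
    """Return (character, run length) pairs for consecutive runs (hypothetical helper)."""
    return [(char, sum(1 for _ in run)) for char, run in itertools.groupby(text)]

encoded = run_length_encode("AAABBCAA")
print(encoded)  # [('A', 3), ('B', 2), ('C', 1), ('A', 2)]
```

Each group shares the underlying iterator with groupby itself, so consume a group (here by counting it) before moving on to the next key.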

5. Utility Functions

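accumulate(iterable, func=None, *, initial=None)

Makes an iterator that returns accumulated sums, or accumulated results of other binary functions (specified by the func argument). When func is omitted, the items are added. The optional initial keyword (Python 3.8+) supplies a starting value, which is yielded first.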

Python

import itertools
import operator # For using operator functions like add, mul

numbers = [1, 2, 3, 4, 5]

# Accumulated sums (default)
for s in itertools.accumulate(numbers):
    print(s) # Output: 1, 3, 6, 10, 15 (1, 1+2, 1+2+3, ...)

print("\n---")

# Accumulated products
for p in itertools.accumulate(numbers, operator.mul):
    print(p) # Output: 1, 2, 6, 24, 120 (1, 1*2, 1*2*3, ...)

print("\n---")

# Accumulated max
for m in itertools.accumulate(numbers, max):
    print(m) # Output: 1, 2, 3, 4, 5 (max up to current point)

print("\n---")

# Running balance in a financial ledger
transactions = [100, -20, 50, -30, 75]
opening_balance = 0
# initial= (Python 3.8+) is yielded first, so the output starts at the opening balance
for balance in itertools.accumulate(transactions, initial=opening_balance):
    print(f"Current balance: {balance}") # Output: 0, 100, 80, 130, 100, 175
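A running maximum is more informative on data that goes up and down. As a sketch with made-up price data, the example below tracks the highest value seen so far and how far each value sits below that peak.

```python
import itertools

prices = [3, 1, 4, 1, 5, 9, 2, 6]  # illustrative values

# Highest price seen up to and including each position
running_max = list(itertools.accumulate(prices, max))
print(running_max)  # [3, 3, 4, 4, 5, 9, 9, 9]

# Distance of each price below the running peak
drawdown = [peak - price for peak, price in zip(running_max, prices)]
print(drawdown)  # [0, 2, 0, 3, 0, 0, 7, 3]
```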

Combining itertools Functions

The real power of itertools comes when you combine these functions to create sophisticated and efficient data processing pipelines.
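As a first small sketch of such a pipeline, the code below chains count, map, filter, and islice to take the first five even squares from an infinite stream. Nothing is computed until islice asks for a value.

```python
import itertools

squares = map(lambda n: n * n, itertools.count(1))      # 1, 4, 9, 16, ...
even_squares = filter(lambda n: n % 2 == 0, squares)    # 4, 16, 36, ...
first_five = list(itertools.islice(even_squares, 5))

print(first_five)  # [4, 16, 36, 64, 100]
```

Each stage is lazy, so the pipeline only does as much work as the final islice requires.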

Example: Finding unique permutations of a string with repeated characters

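If you use itertools.permutations on a string like 'AAB', you'll get six results in which each arrangement appears twice: 'AAB', 'ABA', 'AAB', 'ABA', 'BAA', 'BAA'. This happens because the two 'A' characters are treated as distinct based on their position. To keep only the unique arrangements, collect the results in a set.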

Python

import itertools

s = "AAB"
unique_permutations = set("".join(p) for p in itertools.permutations(s))
print(f"Unique permutations of '{s}': {sorted(unique_permutations)}")
# Output: Unique permutations of 'AAB': ['AAB', 'ABA', 'BAA']

Example: Generating all possible poker hands (simplified)

Let's say we have a deck of cards and we want to find all combinations of 2 cards.

Python

import itertools

suits = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
ranks = ['2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A']

deck = [f"{rank} of {suit}" for rank in ranks for suit in suits]

# Get all combinations of 2 cards (a 2-card hand)
num_hands = 0
for hand in itertools.combinations(deck, 2):
    # print(hand) # Uncomment to see all hands
    num_hands += 1

print(f"Total number of 2-card hands: {num_hands}")

# Using a smaller deck for demonstration
small_deck = ['A_H', 'K_H', 'Q_H', 'J_H']
print("\n--- Example 2-card hands from a small deck ---")
for hand in itertools.combinations(small_deck, 2):
    print(hand)

Conclusion

The itertools module is an indispensable part of Python's standard library for anyone dealing with iterations and data processing. By understanding and utilizing these functions, you can write more efficient, readable, and Pythonic code, especially when working with large datasets or complex combinatorial problems. Practice combining them to unlock their full potential!

