Python itertools Tutorial: Mastering Efficient Iteration


The itertools module in Python provides a collection of fast, memory-efficient tools that are useful by themselves or in combination. They are designed to work together to form more complex iterator pipelines. If you find yourself writing loops with lists, especially if those lists can get very large, itertools often offers a more Pythonic and performant alternative.
Why itertools?
Memory Efficiency: itertools functions return iterators, which produce items one at a time. This means they don't load entire sequences into memory, making them ideal for large or even infinite sequences.
Performance: In CPython, itertools functions are implemented in C under the hood, making them highly optimized for speed.
Code Conciseness: They allow you to express complex iteration patterns in a more compact and readable way.
Mathematical Combinatorics: They provide built-in solutions for common combinatorial problems like permutations, combinations, and cartesian products.
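To make the memory point concrete, here is a minimal sketch that compares a fully built list with an equivalent lazy pipeline. The names n, squares_list, and squares_iter are made up for this illustration, and sys.getsizeof only reports the size of the container object itself, so the exact numbers depend on your Python build.

```python
import itertools
import sys

n = 100_000

# A list stores references to all n results up front.
squares_list = [x * x for x in range(n)]

# An equivalent lazy pipeline: nothing is computed until items are requested.
squares_iter = map(lambda x: x * x, itertools.islice(itertools.count(), n))

list_bytes = sys.getsizeof(squares_list)  # size of the list object itself
iter_bytes = sys.getsizeof(squares_iter)  # size of the small iterator object

print(f"list: {list_bytes} bytes, iterator: {iter_bytes} bytes")

# Both produce the same total; the lazy version never holds all items at once.
assert sum(squares_list) == sum(squares_iter)
```

The iterator stays the same small size no matter how large n gets, while the list grows with it.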
Getting Started
To use itertools, you simply import it:
Python
import itertools
Let's dive into some of the most commonly used functions.
1. Infinite Iterators
These iterators generate sequences indefinitely. You'll typically use break conditions or islice to limit their output.
count(start=0, step=1)
Creates an iterator that returns evenly spaced values starting with start.
Python
import itertools
# Count from 0, step by 1 (default)
for i in itertools.count():
    print(i)
    if i >= 5:
        break # Important to break infinite loops!
print("\n---")
# Count from 10, step by 2
for i in itertools.count(10, 2):
    print(i)
    if i >= 20:
        break
cycle(iterable)
Creates an iterator that endlessly repeats the elements of an iterable.
Python
import itertools
count = 0
for item in itertools.cycle(['A', 'B', 'C']):
    print(item)
    count += 1
    if count >= 7: # Limit the output
        break
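A manual counter is not the only way to stop cycle. As a small sketch (the tasks and workers lists are invented for this example), pairing it with zip gives a round-robin assignment that ends on its own:

```python
import itertools

tasks = ["t1", "t2", "t3", "t4", "t5"]
workers = ["alice", "bob"]

# zip() stops at the shorter input, so the infinite cycle() is cut off
# as soon as the finite task list is exhausted.
assignments = list(zip(tasks, itertools.cycle(workers)))

for task, worker in assignments:
    print(f"{task} -> {worker}")
```

Keep in mind that cycle saves a copy of every element it has seen so it can replay them, so it is best suited to short iterables.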
repeat(object[, times])
Creates an iterator that returns object over and over. If times is given, the object is repeated that many times; if it is omitted, the iterator repeats indefinitely.
Python
import itertools
# Repeat 'Hello' 3 times
for greeting in itertools.repeat('Hello', 3):
    print(greeting)
print("\n---")
# Repeat a list indefinitely (be careful!)
count = 0
for item in itertools.repeat([1, 2, 3]):
    print(item)
    count += 1
    if count >= 4:
        break
2. Terminating Iterators
These iterators work on finite input sequences and produce finite output sequences.
chain(*iterables)
Treats multiple iterables as a single sequence.
Python
import itertools
# Chain lists
for item in itertools.chain([1, 2, 3], ('a', 'b'), 'XYZ'):
    print(item)
print("\n---")
# Chain from iterators
iter1 = (x**2 for x in range(3))
iter2 = (chr(ord('a') + i) for i in range(3))
for item in itertools.chain(iter1, iter2):
    print(item)
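When the iterables to chain are already collected inside another iterable, the alternate constructor chain.from_iterable can be handy. A minimal sketch, with nested as made-up sample data:

```python
import itertools

nested = [[1, 2], [3], [], [4, 5, 6]]

# chain(*nested) would unpack the outer list eagerly;
# chain.from_iterable(nested) reads the inner iterables lazily, one at a time.
flat = list(itertools.chain.from_iterable(nested))

print(flat)
```

This flattens exactly one level of nesting; deeper structures need another approach.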
compress(data, selectors)
Filters elements from data, keeping only those whose corresponding value in selectors is true.
Python
import itertools
data = ['A', 'B', 'C', 'D', 'E']
selectors = [True, False, True, True, False]
for item in itertools.compress(data, selectors):
    print(item) # Output: A, C, D
print("\n---")
prices = [100, 200, 50, 150, 300]
is_affordable = [True, False, True, True, False]
for price in itertools.compress(prices, is_affordable):
    print(f"Affordable: ${price}") # Output: $100, $50, $150
dropwhile(predicate, iterable)
Drops elements from the iterable as long as the predicate is true. Once the predicate becomes false, all remaining elements are yielded.
Python
import itertools
data = [1, 2, 3, 4, 1, 2, 5, 6]
# Drop elements while they are less than 3
for item in itertools.dropwhile(lambda x: x < 3, data):
    print(item) # Output: 3, 4, 1, 2, 5, 6
print("\n---")
# Drop leading spaces
text = " hello world"
for char in itertools.dropwhile(lambda x: x == ' ', text):
    print(char, end='')
filterfalse(predicate, iterable)
Yields only the elements of iterable for which the predicate returns a false value. This is the opposite of filter().
Python
import itertools
numbers = [1, 2, 3, 4, 5, 6]
# Keep elements that are NOT even (i.e., odd numbers)
for num in itertools.filterfalse(lambda x: x % 2 == 0, numbers):
    print(num) # Output: 1, 3, 5
print("\n---")
words = ["apple", "banana", "cat", "dog"]
# Filter out words that start with 'a'
for word in itertools.filterfalse(lambda w: w.startswith('a'), words):
    print(word) # Output: banana, cat, dog
islice(iterable, stop) / islice(iterable, start, stop[, step])
Returns an iterator that yields selected elements from the iterable, similar to slicing a list. Unlike list slicing, it does not support negative values for start, stop, or step.
Python
import itertools
data = range(10)
# Take first 5 elements
for i in itertools.islice(data, 5):
    print(i) # Output: 0, 1, 2, 3, 4
print("\n---")
# Take elements from index 2 to 7 (exclusive)
for i in itertools.islice(data, 2, 7):
    print(i) # Output: 2, 3, 4, 5, 6
print("\n---")
# Take elements from index 1 to 10 (exclusive), with a step of 2
for i in itertools.islice(data, 1, 10, 2):
    print(i) # Output: 1, 3, 5, 7, 9
print("\n---")
# Using islice with an infinite iterator (very useful!)
for i in itertools.islice(itertools.count(100), 5):
    print(i) # Output: 100, 101, 102, 103, 104
takewhile(predicate, iterable)
Yields elements from the iterable as long as the predicate is true. Once the predicate becomes false, iteration stops.
Python
import itertools
data = [1, 2, 3, 4, 1, 2, 5, 6]
# Take elements while they are less than 3
for item in itertools.takewhile(lambda x: x < 3, data):
    print(item) # Output: 1, 2
print("\n---")
temperatures = [20, 22, 25, 28, 30, 26, 24]
# Take temperatures while they are below 30
for temp in itertools.takewhile(lambda t: t < 30, temperatures):
    print(f"{temp}°C") # Output: 20°C, 22°C, 25°C, 28°C
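takewhile and dropwhile are complementary: with the same predicate they split a sequence at the first element that fails the test. The sketch below (readings and is_small are names invented for this example) also shows a subtlety with one-shot iterators, where takewhile has to read the first failing element in order to stop.

```python
import itertools

readings = [1, 2, 3, 4, 1, 2, 5, 6]

def is_small(x):
    return x < 3

# On a re-iterable sequence, the two halves add up to the original.
head = list(itertools.takewhile(is_small, readings))
tail = list(itertools.dropwhile(is_small, readings))
assert head + tail == readings

# With a one-shot iterator, takewhile must consume the first failing
# element (3) to know when to stop, and that element is then gone.
one_shot = iter(readings)
head_once = list(itertools.takewhile(is_small, one_shot))
rest_once = list(one_shot)

print(head_once, rest_once)
```

If you need both halves from a generator or file, materialize it first or split it by hand.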
tee(iterable, n=2)
Returns n independent iterators from a single iterable. This is useful if you need to iterate over the same data multiple times without recreating it, especially if the original iterable is expensive to generate or can only be iterated once.
Python
import itertools
data = [1, 2, 3, 4]
iter1, iter2 = itertools.tee(data)
print(f"Iterator 1: {list(iter1)}") # Convert to list to demonstrate consumption
print(f"Iterator 2: {list(iter2)}") # Can be consumed independently
print("\n---")
# Example: Processing a file multiple times without reopening
with open('my_data.txt', 'w') as f:
    f.write("Line 1\nLine 2\nLine 3\n")
with open('my_data.txt', 'r') as f:
    lines_iter1, lines_iter2 = itertools.tee(f)
    print("First pass (reading lines):")
    for line in lines_iter1:
        print(line.strip())
    print("\nSecond pass (counting characters):")
    total_chars = 0
    for line in lines_iter2:
        total_chars += len(line.strip())
    print(f"Total characters: {total_chars}")
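Note: tee accepts any iterable, but once you have called it, do not use the original iterator anywhere else; advancing it directly can make the tee iterators miss items. tee also has to buffer every item that one iterator has consumed and another has not, so if one copy runs to the end before the other starts, calling list() on the data is usually faster. An already-consumed generator has nothing left to give, so tee on it simply returns empty iterators.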
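tee tends to pay off when the copies advance roughly in step, so only a few items need buffering at a time. A minimal sketch of that pattern follows; pairwise_sketch is a hypothetical helper written for this illustration (Python 3.10+ ships a ready-made itertools.pairwise).

```python
import itertools

def pairwise_sketch(iterable):
    """Yield overlapping pairs (s0, s1), (s1, s2), ... using tee."""
    first, second = itertools.tee(iterable)
    next(second, None)  # advance the second copy by one item
    return zip(first, second)

gen = (x * x for x in range(5))  # one-shot generator: 0, 1, 4, 9, 16
pairs = list(pairwise_sketch(gen))

print(pairs)
```

Because the two copies are never more than one item apart, the internal buffer stays tiny even for very long inputs.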
zip_longest(*iterables, fillvalue=None)
Aggregates elements from each of the iterables. If the iterables are of different lengths, missing values are filled with fillvalue (defaults to None).
Python
import itertools
names = ['Alice', 'Bob', 'Charlie']
ages = [30, 24]
cities = ['New York', 'London', 'Paris', 'Tokyo']
for name, age, city in itertools.zip_longest(names, ages, cities, fillvalue='N/A'):
    print(f"{name:<10} {age:<5} {city}")
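For contrast, here is a short sketch of how the built-in zip treats the same kind of uneven input. It reuses names and ages from the example above; truncated and padded are names chosen for this illustration.

```python
import itertools

names = ['Alice', 'Bob', 'Charlie']
ages = [30, 24]

# zip() stops at the shortest input and silently drops 'Charlie'.
truncated = list(zip(names, ages))

# zip_longest() keeps going and pads the missing age.
padded = list(itertools.zip_longest(names, ages, fillvalue='N/A'))

print(truncated)
print(padded)
```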
3. Combinatoric Generators
These functions are perfect for generating permutations, combinations, and cartesian products.
product(*iterables, repeat=1)
Generates the Cartesian product of input iterables. Equivalent to nested for-loops.
Python
import itertools
# Product of two lists
for p in itertools.product([1, 2], ['a', 'b']):
    print(p)
# Output: (1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')
print("\n---")
# Product with 'repeat'
# Equivalent to: itertools.product(range(2), range(2), range(2))
for p in itertools.product(range(2), repeat=3):
    print(p)
# Output: (0, 0, 0), (0, 0, 1), (0, 1, 0), ..., (1, 1, 1)
print("\n---")
# Generating all possible PINs (4-digit numbers)
digits = '0123456789'
pin_count = 0
for pin_tuple in itertools.product(digits, repeat=4):
    pin = "".join(pin_tuple)
    # print(pin) # Uncomment to see all PINs
    pin_count += 1
print(f"Total 4-digit PINs: {pin_count}") # Output: Total 4-digit PINs: 10000
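The "equivalent to nested for-loops" claim can be checked directly. In this sketch, colors and sizes are invented sample data:

```python
import itertools

colors = ['red', 'green']
sizes = ['S', 'M', 'L']

# Hand-written nested loops: the rightmost (inner) loop varies fastest.
nested = []
for color in colors:
    for size in sizes:
        nested.append((color, size))

# product() yields the same tuples in the same order.
via_product = list(itertools.product(colors, sizes))

print(via_product == nested)
```

One difference worth knowing: product reads each input completely before it yields anything, so unlike a nested loop it cannot be fed an infinite iterator.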
permutations(iterable, r=None)
Returns successive r-length permutations of elements from the iterable. If r is not specified or is None, then r defaults to the length of the iterable and all possible full-length permutations are generated.
Python
import itertools
# All permutations of 'ABC'
for p in itertools.permutations('ABC'):
    print("".join(p))
# Output: ABC, ACB, BAC, BCA, CAB, CBA
print("\n---")
# Permutations of length 2 from 'ABC'
for p in itertools.permutations('ABC', 2):
    print("".join(p))
# Output: AB, AC, BA, BC, CA, CB
combinations(iterable, r)
Returns r-length subsequences of elements from the input iterable where the order of elements does not matter. Elements are treated as unique based on their position, not their value, so repeated values in the input can produce repeated combinations.
Python
import itertools
# Combinations of length 2 from 'ABC'
for c in itertools.combinations('ABC', 2):
    print("".join(c))
# Output: AB, AC, BC (order doesn't matter, so BA is not included after AB)
print("\n---")
# Choosing 2 items from a list of numbers
numbers = [1, 2, 3, 4]
for combo in itertools.combinations(numbers, 2):
    print(combo)
# Output: (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)
combinations_with_replacement(iterable, r)
Returns r-length subsequences of elements from the input iterable, allowing individual elements to be repeated.
Python
import itertools
# Combinations with replacement of length 2 from 'ABC'
for c in itertools.combinations_with_replacement('ABC', 2):
    print("".join(c))
# Output: AA, AB, AC, BB, BC, CC
print("\n---")
# Possible outcomes of 2 coin tosses when the order of the tosses is ignored
coins = ['H', 'T']
for result in itertools.combinations_with_replacement(coins, 2):
    print(result)
# Output: ('H', 'H'), ('H', 'T'), ('T', 'T')
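The four combinatoric functions differ in how many results they produce for n input items and length r. The sketch below checks the counts against the standard formulas, assuming Python 3.8+ for math.comb and math.perm; items, counts, and expected are names chosen for this example.

```python
import itertools
import math

items = 'ABCD'
n, r = len(items), 2

# Count the results of each function by exhausting its iterator.
counts = {
    "product": sum(1 for _ in itertools.product(items, repeat=r)),
    "permutations": sum(1 for _ in itertools.permutations(items, r)),
    "combinations": sum(1 for _ in itertools.combinations(items, r)),
    "combinations_with_replacement": sum(
        1 for _ in itertools.combinations_with_replacement(items, r)
    ),
}

# Closed-form counts for the same n and r.
expected = {
    "product": n ** r,                                         # 16
    "permutations": math.perm(n, r),                           # 12
    "combinations": math.comb(n, r),                           # 6
    "combinations_with_replacement": math.comb(n + r - 1, r),  # 10
}

for name, count in counts.items():
    print(f"{name}: {count}")
```

These counts grow very quickly with n and r, so it is worth computing the formula first before looping over a large result set.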
4. Grouping Data
groupby(iterable, key=None)
Makes an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If no key function is specified or is None, the element itself is used as the key.
Important: groupby() only groups consecutive elements with the same key. To group across the entire dataset, sort the iterable with the same key function first.
Python
import itertools
data = [
    {'name': 'Alice', 'age': 30},
    {'name': 'Bob', 'age': 25},
    {'name': 'Charlie', 'age': 30},
    {'name': 'David', 'age': 25},
    {'name': 'Eve', 'age': 30},
]
# Sort the data by age for proper grouping
data.sort(key=lambda x: x['age'])
print("Sorted data:")
for item in data:
    print(item)
print("\n--- Grouping by age ---")
for age, group in itertools.groupby(data, key=lambda x: x['age']):
    print(f"Age: {age}")
    for person in group:
        print(f" - {person['name']}")
print("\n--- Example with unsorted data (to show the effect) ---")
unsorted_data = [1, 2, 2, 3, 1, 1, 2]
for k, g in itertools.groupby(unsorted_data):
    print(f"Key: {k}, Group: {list(g)}")
# Output:
# Key: 1, Group: [1]
# Key: 2, Group: [2, 2]
# Key: 3, Group: [3]
# Key: 1, Group: [1, 1]
# Key: 2, Group: [2]
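A common follow-up is to collect the groups into a dictionary. Here is a minimal sketch of the sort-then-group pattern; words and first_letter are made up for this illustration.

```python
import itertools

words = ["apple", "bob", "cat", "avocado", "banana", "cherry"]

def first_letter(word):
    return word[0]

# Sort and group with the SAME key function. Each group is an iterator that
# shares state with groupby(), so it is turned into a list right away.
grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(
        sorted(words, key=first_letter), key=first_letter
    )
}

print(grouped)
```

Since sorted is stable, words that share a first letter keep their original relative order inside each group.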
5. Utility Functions
accumulate(iterable, func=None, *, initial=None)
Makes an iterator that returns accumulated sums, or accumulated results of other binary functions (specified by the func argument). If initial is given (Python 3.8+), it is yielded first and used as the starting value.
Python
import itertools
import operator # For using operator functions like add, mul
numbers = [1, 2, 3, 4, 5]
# Accumulated sums (default)
for s in itertools.accumulate(numbers):
    print(s) # Output: 1, 3, 6, 10, 15 (1, 1+2, 1+2+3, ...)
print("\n---")
# Accumulated products
for p in itertools.accumulate(numbers, operator.mul):
    print(p) # Output: 1, 2, 6, 24, 120 (1, 1*2, 1*2*3, ...)
print("\n---")
# Accumulated max
for m in itertools.accumulate(numbers, max):
    print(m) # Output: 1, 2, 3, 4, 5 (max up to current point)
print("\n---")
# Running balance in a financial ledger
transactions = [100, -20, 50, -30, 75]
balance = 0
for t in itertools.accumulate(transactions, operator.add, initial=balance):
    print(f"Current balance: {t}") # Output: 0, 100, 80, 130, 100, 175
Combining itertools Functions
The real power of itertools comes when you combine these functions to create sophisticated and efficient data processing pipelines.
Example: Finding unique permutations of a string with repeated characters
If you use itertools.permutations on a string like 'AAB', you'll get duplicates: 'AAB', 'ABA', 'AAB', 'ABA', 'BAA', 'BAA'. This happens because elements are treated as unique by position, not by value. To get unique ones, you can use set.
Python
import itertools
s = "AAB"
unique_permutations = set("".join(p) for p in itertools.permutations(s))
print(f"Unique permutations of '{s}': {sorted(unique_permutations)}")
# Output: Unique permutations of 'AAB': ['AAB', 'ABA', 'BAA']
Example: Generating all possible poker hands (simplified)
Let's say we have a deck of cards and we want to find all combinations of 2 cards.
Python
import itertools
suits = ['Hearts', 'Diamonds', 'Clubs', 'Spades']
ranks = ['2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K', 'A']
deck = [f"{rank} of {suit}" for rank in ranks for suit in suits]
# Get all combinations of 2 cards (a 2-card hand)
num_hands = 0
for hand in itertools.combinations(deck, 2):
    # print(hand) # Uncomment to see all hands
    num_hands += 1
print(f"Total number of 2-card hands: {num_hands}") # Output: 1326
# Using a smaller deck for demonstration
small_deck = ['A_H', 'K_H', 'Q_H', 'J_H']
print("\n--- Example 2-card hands from a small deck ---")
for hand in itertools.combinations(small_deck, 2):
    print(hand)
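Example: A lazy multi-stage pipeline
As one more illustration of combining functions, the sketch below chains several of the tools covered in this tutorial into a single pipeline. The task itself (running totals of odd squares, stopping before 200) is invented for this example. No stage does any work until list() pulls values through at the end.

```python
import itertools

# Stage 1: infinite stream of integers 1, 2, 3, ...
numbers = itertools.count(1)
# Stage 2: drop the even numbers -> 1, 3, 5, ...
odds = itertools.filterfalse(lambda n: n % 2 == 0, numbers)
# Stage 3: square each value lazily -> 1, 9, 25, 49, ...
squares = map(lambda n: n * n, odds)
# Stage 4: running totals -> 1, 10, 35, 84, ...
running = itertools.accumulate(squares)
# Stage 5: stop as soon as a total reaches 200
bounded = itertools.takewhile(lambda total: total < 200, running)

result = list(bounded)
print(result)
```

Even though the source is infinite, takewhile ends the pipeline, so only a handful of numbers are ever generated.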
Conclusion
The itertools module is an indispensable part of Python's standard library for anyone dealing with iterations and data processing. By understanding and utilizing these functions, you can write more efficient, readable, and Pythonic code, especially when working with large datasets or complex combinatorial problems. Practice combining them to unlock their full potential!
Written by Vigneswaran S