Multi-Key Sorting in Python: A Complete Guide
Sorting by a single key is trivial in Python. The interesting problems start when you need to sort by multiple criteria — primary, secondary, tiebreaker — especially when some fields sort ascending and others descending, or when your data has missing values.
The foundation: tuple comparison
Python compares tuples lexicographically: it compares the first element, and only moves to the second if the first is equal. This maps directly onto multi-key sorting.
(5, 2) < (5, 3) # True — first elements equal, compare second
(4, 9) < (5, 0) # True — first element decides it
This means your key function can return a tuple, and Python's sort will handle priority automatically:
from dataclasses import dataclass

@dataclass
class Athlete:
    name: str
    total_lifted: int  # kg
    body_weight: float  # kg

athletes = [
    Athlete("Alice", 200, 60.0),
    Athlete("Bob", 200, 65.0),
    Athlete("Charlie", 195, 62.0),
    Athlete("Diana", 205, 63.0),
]

# Primary: most weight lifted (desc). Tiebreaker: lowest body weight (asc).
ranked = sorted(athletes, key=lambda a: (-a.total_lifted, a.body_weight))
for i, a in enumerate(ranked, 1):
    print(f"{i}. {a.name} — {a.total_lifted}kg lifted, {a.body_weight}kg BW")
Output:
1. Diana — 205kg lifted, 63.0kg BW
2. Alice — 200kg lifted, 60.0kg BW
3. Bob — 200kg lifted, 65.0kg BW
4. Charlie — 195kg lifted, 62.0kg BW
Alice beats Bob despite the same total because she has the lower body weight. The - on total_lifted flips the sort direction for that field while keeping body_weight ascending.
When negation doesn't work
Negation works for numbers. It doesn't work for strings or other types. For mixed directions on non-numeric fields, use two separate sorted() calls — Python's sort is stable, so the order of equal elements from the first pass is preserved in the second:
# Sort by last_name ascending, then by score descending
data = sorted(data, key=lambda x: x.score, reverse=True) # secondary first
data = sorted(data, key=lambda x: x.last_name) # primary second
Counterintuitive but correct: apply the least significant sort first, then the most significant. Stability does the rest.
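A self-contained sketch of the two-pass approach with two string keys, where negation is not an option. The Person class and the sample names are assumptions made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Person:  # hypothetical record type for this example
    last_name: str
    first_name: str


people = [
    Person("Smith", "Anna"),
    Person("Jones", "Carl"),
    Person("Smith", "Zoe"),
    Person("Jones", "Amy"),
]

# Pass 1: least significant key (first_name, descending).
by_first = sorted(people, key=lambda p: p.first_name, reverse=True)

# Pass 2: most significant key (last_name, ascending). People who share a
# last_name keep the order from pass 1 because the sort is stable.
ordered = sorted(by_first, key=lambda p: p.last_name)

result = [(p.last_name, p.first_name) for p in ordered]
print(result)
# [('Jones', 'Carl'), ('Jones', 'Amy'), ('Smith', 'Zoe'), ('Smith', 'Anna')]
```

Within each last name the first names run in reverse alphabetical order, which is the mixed-direction result a single tuple key cannot express for strings.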
operator.attrgetter for clean attribute access
For simple attribute chains, operator.attrgetter is cleaner than a lambda and slightly faster:
from operator import attrgetter
# Single key
sorted(athletes, key=attrgetter('total_lifted'))
# Multiple keys (all ascending)
sorted(athletes, key=attrgetter('total_lifted', 'body_weight'))
The multi-key form of attrgetter only supports same-direction sorting — use the tuple lambda approach when you need mixed directions.
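To see why the multi-key form behaves like the tuple lambda, it can help to call the getter directly. The sketch below uses a SimpleNamespace as a stand-in object (an assumption for brevity; any object with these attributes would do):

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical stand-in for an Athlete instance.
lifter = SimpleNamespace(name="Alice", total_lifted=200, body_weight=60.0)

single = attrgetter("total_lifted")
multi = attrgetter("total_lifted", "body_weight")

single_value = single(lifter)  # a bare value: 200
multi_value = multi(lifter)    # a tuple: (200, 60.0)

# The multi-key getter builds the same key as the equivalent lambda,
# so sorted() compares the results lexicographically in both cases.
lambda_value = (lambda a: (a.total_lifted, a.body_weight))(lifter)
same_key = multi_value == lambda_value

print(single_value, multi_value, same_key)
```

Because the getter only reads attributes, it has no place to put a minus sign, which is why mixed directions need the lambda.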
Handling None values
Real data has missing values. Comparing None with integers raises a TypeError in Python 3:
sorted([3, None, 1]) # TypeError: '<' not supported between instances of 'NoneType' and 'int'
The standard fix is to push None to the end explicitly:
sorted(values, key=lambda x: (x is None, x))
x is None evaluates to False (0) for real values and True (1) for None, so real values always sort before None. Extend this for multi-key scenarios:
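sorted(
    athletes,
    key=lambda a: (
        a.total_lifted is None,       # athletes with no total go last
        -(a.total_lifted or 0),       # then highest total first
        a.body_weight is None,        # among ties, missing body weight goes last
        a.body_weight or 0,           # then lowest body weight first
    )
)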
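A runnable sketch of both patterns with missing data. The Entry class is a hypothetical variant of Athlete whose numeric fields are allowed to be None, and the sample values are invented for illustration:

```python
from dataclasses import dataclass
from typing import Optional

# Single key: the boolean flag sorts first, so None never meets an int.
values = [3, None, 1, None, 2]
nones_last = sorted(values, key=lambda x: (x is None, x))
nones_first = sorted(values, key=lambda x: (x is not None, x))


@dataclass
class Entry:  # hypothetical: Athlete with optional numeric fields
    name: str
    total_lifted: Optional[int]
    body_weight: Optional[float]


entries = [
    Entry("Alice", 200, None),
    Entry("Bob", 200, 65.0),
    Entry("Charlie", None, 62.0),
    Entry("Diana", 205, 63.0),
]


def ranking_key(e):
    # One "is None" flag in front of each field that may be missing.
    return (
        e.total_lifted is None,
        -(e.total_lifted or 0),
        e.body_weight is None,
        e.body_weight or 0,
    )


order = [e.name for e in sorted(entries, key=ranking_key)]
print(nones_last)   # [1, 2, 3, None, None]
print(nones_first)  # [None, None, 1, 2, 3]
print(order)        # ['Diana', 'Bob', 'Alice', 'Charlie']
```

Bob ranks ahead of Alice here because her body weight is missing, and Charlie comes last because he has no total at all.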
Sorting a DataFrame
The same logic applies to pandas, with cleaner syntax:
import pandas as pd
df = pd.DataFrame([
    {"name": "Alice", "total_lifted": 200, "body_weight": 60.0},
    {"name": "Bob", "total_lifted": 200, "body_weight": 65.0},
    {"name": "Charlie", "total_lifted": 195, "body_weight": 62.0},
    {"name": "Diana", "total_lifted": 205, "body_weight": 63.0},
])
df.sort_values(
    by=["total_lifted", "body_weight"],
    ascending=[False, True],
).reset_index(drop=True)
ascending accepts a list, one value per column — no negation needed, no stability tricks.
Performance
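The key function runs exactly once per element, so its cost grows linearly with the input, and tuples of built-in types are then compared in C. functools.cmp_to_key lets you write a comparator function directly, which helps when an ordering is awkward to express as a key, but the comparator is called for every comparison the sort makes (on the order of n log n calls), so it is usually slower than a key function, not faster. For most sorting tasks the tuple-key approach is fast enough and far more readable.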
The practical threshold: if you're sorting millions of objects and the sort is in a hot path, profile first. Otherwise, keep the tuple approach.
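To make the cost model concrete, the sketch below counts how often a key function and a cmp_to_key comparator are invoked on the same input. The data is random and the helper names are made up for this example; it illustrates call counts only and is not a benchmark:

```python
import random
from collections import Counter
from functools import cmp_to_key

random.seed(0)  # fixed seed so the run is repeatable
pairs = [(random.randint(0, 50), random.randint(0, 50)) for _ in range(1000)]

calls = Counter()


def tuple_key(p):
    calls["key"] += 1
    return (-p[0], p[1])  # first field descending, second ascending


def compare(a, b):
    calls["cmp"] += 1
    if a[0] != b[0]:
        return b[0] - a[0]  # larger first field sorts earlier
    return a[1] - b[1]      # smaller second field sorts earlier


by_key = sorted(pairs, key=tuple_key)
by_cmp = sorted(pairs, key=cmp_to_key(compare))

same_order = by_key == by_cmp
print(calls["key"], calls["cmp"], same_order)
```

Both sorts produce the same order, but the key function is called once per element while the comparator is called once per comparison, which is many more times on shuffled input.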
Quick reference
| Need | Pattern |
|---|---|
| Multi-key, all ascending | attrgetter('a', 'b') |
| Multi-key, mixed directions (numeric) | lambda x: (-x.a, x.b) |
| Multi-key, mixed directions (strings) | Two-pass stable sort |
| Handle None | lambda x: (x is None, x) |
| DataFrame | sort_values(by=[...], ascending=[...]) |