python

Multi-Key Sorting in Python: A Complete Guide


Sorting by a single key is trivial in Python. The interesting problems start when you need to sort by multiple criteria — primary, secondary, tiebreaker — especially when some fields sort ascending and others descending, or when your data has missing values.

The foundation: tuple comparison

Python compares tuples lexicographically: it compares the first element, and only moves to the second if the first is equal. This maps directly onto multi-key sorting.

(5, 2) < (5, 3)   # True — first elements equal, compare second
(4, 9) < (5, 0)   # True — first element decides it
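The same rule carries over to longer tuples and to sorted() itself. A small sketch, using made-up (score, age, name) records that are not part of the examples in this post:

```python
# Hypothetical (score, age, name) records, used only for illustration.
records = [(5, 3, "b"), (4, 9, "z"), (5, 2, "c"), (5, 2, "a")]

# sorted() compares the tuples element by element, left to right,
# and only looks at a later position when all earlier ones are equal.
ordered = sorted(records)
print(ordered)
# [(4, 9, 'z'), (5, 2, 'a'), (5, 2, 'c'), (5, 3, 'b')]
```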

This means your key function can return a tuple, and Python's sort will handle priority automatically:

from dataclasses import dataclass

@dataclass
class Athlete:
    name: str
    total_lifted: int  # kg
    body_weight: float # kg

athletes = [
    Athlete("Alice",   200, 60.0),
    Athlete("Bob",     200, 65.0),
    Athlete("Charlie", 195, 62.0),
    Athlete("Diana",   205, 63.0),
]

# Primary: most weight lifted (desc). Tiebreaker: lowest body weight (asc).
ranked = sorted(athletes, key=lambda a: (-a.total_lifted, a.body_weight))

for i, a in enumerate(ranked, 1):
    print(f"{i}. {a.name:<7} — {a.total_lifted}kg lifted, {a.body_weight}kg BW")

Output:

1. Diana   — 205kg lifted, 63.0kg BW
2. Alice   — 200kg lifted, 60.0kg BW
3. Bob     — 200kg lifted, 65.0kg BW
4. Charlie — 195kg lifted, 62.0kg BW

Alice beats Bob despite the same total because she has the lower body weight. The - on total_lifted flips the sort direction for that field while keeping body_weight ascending.

When negation doesn't work

Negation works for numbers, but strings and most other types don't support unary minus, so -x.last_name raises a TypeError. For mixed directions on non-numeric fields, use two separate sorted() calls. Python's sort is stable, so the order of equal elements from the first pass is preserved in the second:

# Sort by last_name ascending, then by score descending
data = sorted(data, key=lambda x: x.score, reverse=True)   # secondary first
data = sorted(data, key=lambda x: x.last_name)              # primary second

Counterintuitive but correct: apply the least significant sort first, then the most significant. Stability does the rest.
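Here is a runnable sketch of the two-pass approach where the descending field is a string, so negation is not an option. The Person class and its sample names are hypothetical, invented for this illustration:

```python
from dataclasses import dataclass

@dataclass
class Person:  # hypothetical record type for this example
    last_name: str
    first_name: str

people = [
    Person("Smith", "Anna"),
    Person("Jones", "Zed"),
    Person("Smith", "Carl"),
    Person("Jones", "Bea"),
]

# Pass 1: least significant key (first_name, descending).
by_first = sorted(people, key=lambda p: p.first_name, reverse=True)

# Pass 2: most significant key (last_name, ascending). Stability keeps
# the first_name order from pass 1 within each group of equal last names.
result = sorted(by_first, key=lambda p: p.last_name)

print([(p.last_name, p.first_name) for p in result])
# [('Jones', 'Zed'), ('Jones', 'Bea'), ('Smith', 'Carl'), ('Smith', 'Anna')]
```

Note that reverse=True also preserves stability: elements with equal keys keep their original relative order.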

operator.attrgetter for clean attribute access

For simple attribute chains, operator.attrgetter is cleaner than a lambda and slightly faster:

from operator import attrgetter

# Single key
sorted(athletes, key=attrgetter('total_lifted'))

# Multiple keys (all ascending)
sorted(athletes, key=attrgetter('total_lifted', 'body_weight'))

The multi-key form of attrgetter only supports same-direction sorting — use the tuple lambda approach when you need mixed directions.
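To see why the multi-key form behaves like a tuple key, it helps to call the getter directly. A short sketch that redefines a trimmed copy of the Athlete class so it runs on its own:

```python
from dataclasses import dataclass
from operator import attrgetter

@dataclass
class Athlete:
    name: str
    total_lifted: int   # kg
    body_weight: float  # kg

athletes = [
    Athlete("Alice", 200, 60.0),
    Athlete("Bob", 200, 65.0),
    Athlete("Charlie", 195, 62.0),
]

# With several names, attrgetter returns a tuple of the attribute values.
get_keys = attrgetter("total_lifted", "body_weight")
first_key = get_keys(athletes[0])
print(first_key)  # (200, 60.0)

# So it sorts exactly like the equivalent tuple-returning lambda.
by_getter = sorted(athletes, key=get_keys)
by_lambda = sorted(athletes, key=lambda a: (a.total_lifted, a.body_weight))
print([a.name for a in by_getter])  # ['Charlie', 'Alice', 'Bob']
```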

Handling None values

Real data has missing values. Comparing None with integers raises a TypeError in Python 3:

sorted([3, None, 1])  # TypeError: '<' not supported between instances of 'NoneType' and 'int'

The standard fix is to push None to the end explicitly:

sorted(values, key=lambda x: (x is None, x))

x is None evaluates to False (0) for real values and True (1) for None, so real values always sort before None. Extend this for multi-key scenarios:

sorted(
    athletes,
    key=lambda a: (
        a.total_lifted is None,
        -(a.total_lifted or 0),
        a.body_weight is None,
        a.body_weight or 0,
    )
)
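A self-contained sketch of both patterns follows. The Entry class is a hypothetical variant of Athlete with optional fields, and the sample rows are invented for illustration:

```python
from dataclasses import dataclass
from typing import Optional

# Single-key case: real values first, None last.
simple = sorted([3, None, 1], key=lambda x: (x is None, x))
print(simple)  # [1, 3, None]

@dataclass
class Entry:  # hypothetical variant of Athlete with optional fields
    name: str
    total_lifted: Optional[int]
    body_weight: Optional[float]

entries = [
    Entry("Alice", 200, 60.0),
    Entry("Eve", None, 58.0),
    Entry("Bob", 200, None),
    Entry("Diana", 205, 63.0),
]

def sort_key(e):
    return (
        e.total_lifted is None,   # missing totals go last
        -(e.total_lifted or 0),   # placeholder 0 only ever meets other placeholders
        e.body_weight is None,    # missing body weights lose the tiebreak
        e.body_weight or 0,
    )

ranked = sorted(entries, key=sort_key)
print([e.name for e in ranked])
# ['Diana', 'Alice', 'Bob', 'Eve']
```

The placeholder after each `is None` flag only needs to be comparable; the flag in front of it has already decided the order between a real value and a missing one.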

Sorting a DataFrame

The same logic applies to pandas, with cleaner syntax:

import pandas as pd

df = pd.DataFrame([
    {"name": "Alice",   "total_lifted": 200, "body_weight": 60.0},
    {"name": "Bob",     "total_lifted": 200, "body_weight": 65.0},
    {"name": "Charlie", "total_lifted": 195, "body_weight": 62.0},
    {"name": "Diana",   "total_lifted": 205, "body_weight": 63.0},
])

df.sort_values(
    by=["total_lifted", "body_weight"],
    ascending=[False, True]
).reset_index(drop=True)

ascending accepts a list, one value per column — no negation needed, no stability tricks.

Performance

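The key function runs exactly once per element, so a tuple key costs one call and one small tuple per item. functools.cmp_to_key lets you write a comparator function directly, but the comparator is called for every comparison the sort makes, which means O(n log n) Python-level calls, so it is usually slower than a tuple key rather than faster. Its real value is expressing orderings that are awkward to write as a key, such as mixed directions on strings in a single pass. For most sorting tasks the tuple-key approach is fast enough and far more readable.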
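For reference, a minimal sketch of what a comparator looks like with functools.cmp_to_key. It orders hypothetical (last_name, first_name) pairs by last name ascending and first name descending in one pass, and checks the result against the two-pass stable sort:

```python
from functools import cmp_to_key

# Hypothetical (last_name, first_name) pairs.
people = [("Smith", "Anna"), ("Jones", "Zed"), ("Smith", "Carl"), ("Jones", "Bea")]

def compare(a, b):
    """Return a negative number if a sorts first, positive if b does, 0 if tied."""
    if a[0] != b[0]:
        return -1 if a[0] < b[0] else 1   # last name ascending
    if a[1] != b[1]:
        return -1 if a[1] > b[1] else 1   # first name descending
    return 0

one_pass = sorted(people, key=cmp_to_key(compare))

# The two-pass stable sort gives the same order.
two_pass = sorted(
    sorted(people, key=lambda p: p[1], reverse=True),
    key=lambda p: p[0],
)

print(one_pass)
# [('Jones', 'Zed'), ('Jones', 'Bea'), ('Smith', 'Carl'), ('Smith', 'Anna')]
```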

The practical threshold: if you're sorting millions of objects and the sort is in a hot path, profile first. Otherwise, keep the tuple approach.
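One rough way to profile is timeit from the standard library. The sketch below uses randomly generated (total_lifted, body_weight) pairs as stand-in data; the dataset size and repeat count are arbitrary assumptions, and the timings will differ between machines and Python versions:

```python
import random
import timeit
from functools import cmp_to_key

random.seed(0)
# Hypothetical data: (total_lifted, body_weight) pairs.
rows = [(random.randint(100, 300), random.uniform(50, 120)) for _ in range(20_000)]

def tuple_key(r):
    return (-r[0], r[1])  # total descending, body weight ascending

def compare(a, b):
    if a[0] != b[0]:
        return -1 if a[0] > b[0] else 1   # total descending
    if a[1] != b[1]:
        return -1 if a[1] < b[1] else 1   # body weight ascending
    return 0

# Both approaches must produce the same order before timing means anything.
same_order = sorted(rows, key=tuple_key) == sorted(rows, key=cmp_to_key(compare))

t_key = timeit.timeit(lambda: sorted(rows, key=tuple_key), number=5)
t_cmp = timeit.timeit(lambda: sorted(rows, key=cmp_to_key(compare)), number=5)
print(f"same order: {same_order}")
print(f"tuple key:  {t_key:.3f}s for 5 runs")
print(f"cmp_to_key: {t_cmp:.3f}s for 5 runs")
```

Run it against a sample of your real data rather than trusting these synthetic rows; key cost depends heavily on what the key function has to compute.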

Quick reference

Need                                     Pattern
Multi-key, all ascending                 attrgetter('a', 'b')
Multi-key, mixed directions (numeric)    lambda x: (-x.a, x.b)
Multi-key, mixed directions (strings)    Two-pass stable sort
Handle None                              lambda x: (x is None, x)
DataFrame                                sort_values(by=[...], ascending=[...])