Decorator-Based Caching in Python: From In-Memory to Persistent
If you're iterating on a script that hits an external API or runs a slow computation, every re-run means waiting. A caching decorator wraps any function and short-circuits repeated calls with the stored result — no changes to call sites, no manual cache management.
How Python decorators work
A decorator is a function that takes another function and returns a modified version of it. The @ syntax is just shorthand for reassigning the function name:
def my_decorator(func):
    def wrapper():
        print("before")
        func()
        print("after")
    return wrapper

@my_decorator
def say_hello():
    print("Hello!")

# Equivalent to: say_hello = my_decorator(say_hello)
say_hello()
Output:
before
Hello!
after
The decorator replaces say_hello with wrapper, which calls the original function internally. This is the hook we'll use to intercept calls and serve cached results.
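The wrapper above takes no arguments, so it only works for zero-argument functions. A general-purpose decorator conventionally forwards *args and **kwargs and propagates the return value — a minimal sketch (the `add` function is just an illustration):

```python
def my_decorator(func):
    def wrapper(*args, **kwargs):
        print("before")
        # forward any positional and keyword arguments
        result = func(*args, **kwargs)
        print("after")
        # propagate the original return value
        return result
    return wrapper

@my_decorator
def add(a, b):
    return a + b

add(2, 3)  # prints "before" and "after", returns 5
```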
Basic in-memory cache
from functools import wraps

def cached(func):
    cache = {}
    @wraps(func)
    def wrapper(*args, **kwargs):
        key = (args, frozenset(kwargs.items()))
        if key in cache:
            return cache[key]
        result = func(*args, **kwargs)
        cache[key] = result
        return result
    return wrapper
The @wraps(func) line preserves the original function's __name__ and __doc__ — without it, every decorated function would appear as wrapper in tracebacks and help() output.
The cache key is a tuple of (args, frozenset(kwargs.items())). The frozenset makes the keyword arguments hashable so they can be part of a dict key, and it's order-independent, so f(a=1, b=2) and f(b=2, a=1) hit the same cache entry.
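To see the order-independence concretely, here's the key construction pulled out into a standalone helper (a sketch for illustration — in the decorator it's inlined in wrapper):

```python
def make_key(args, kwargs):
    # same key construction as inside wrapper
    return (args, frozenset(kwargs.items()))

k1 = make_key((), {"a": 1, "b": 2})
k2 = make_key((), {"b": 2, "a": 1})
k1 == k2  # True — keyword order doesn't affect the key
```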
import time

@cached
def slow_function(a, b):
    time.sleep(5)
    return a + b

slow_function(2, 3)  # waits 5 seconds → returns 5
slow_function(2, 3)  # returns 5 instantly
slow_function(2, 4)  # waits 5 seconds → returns 6 (different key)
The cache lives in the wrapper closure, so it persists for the lifetime of the process but disappears when the script exits.
File-based cache for persistence
For scripts you run repeatedly — data pipelines, report generators, anything with expensive upstream calls — you want the cache to survive between runs:
import pickle
import hashlib
import json
from pathlib import Path
from functools import wraps
from json import JSONEncoder

class FallbackEncoder(JSONEncoder):
    """Handles non-JSON-serializable args by falling back to the class name."""
    def default(self, o):
        return o.__class__.__name__

def file_cached(func):
    cache_dir = Path(".cache")
    @wraps(func)
    def wrapper(*args, **kwargs):
        args_str = json.dumps({"args": args, **kwargs}, cls=FallbackEncoder, sort_keys=True)
        digest = hashlib.sha1(args_str.encode()).hexdigest()[:12]
        cache_file = cache_dir / f"{func.__name__}_{digest}"
        if cache_file.exists():
            with open(cache_file, "rb") as f:
                return pickle.load(f)
        result = func(*args, **kwargs)
        cache_dir.mkdir(parents=True, exist_ok=True)
        with open(cache_file, "wb") as f:
            pickle.dump(result, f)
        return result
    return wrapper
A few design choices worth noting:
- SHA-1 digest as filename — serializing args to JSON and hashing them gives a stable, filesystem-safe key for any argument combination. A 12-character prefix of the hex digest is collision-resistant enough for local caching.
- pickle for storage — pickle handles arbitrary Python objects (dataframes, custom classes, numpy arrays) that JSON can't. For untrusted inputs pickle is unsafe, but for your own script's output it's fine.
- FallbackEncoder — if an argument isn't JSON-serializable, falling back to the class name means the key degrades gracefully rather than raising. You lose key uniqueness for complex objects, but it's better than crashing.
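The trade-off in that last point is easy to see: two distinct instances of the same class serialize identically, so they would share a cache key (a sketch using the FallbackEncoder from above; the Config class is just an illustration):

```python
import json
from json import JSONEncoder

class FallbackEncoder(JSONEncoder):
    def default(self, o):
        return o.__class__.__name__

class Config:
    def __init__(self, retries):
        self.retries = retries

# Both instances collapse to the string "Config" in the key —
# graceful, but lossy: they'd hit the same cache entry.
json.dumps({"args": (Config(1),)}, cls=FallbackEncoder)  # '{"args": ["Config"]}'
json.dumps({"args": (Config(9),)}, cls=FallbackEncoder)  # same string
```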
@file_cached
def fetch_report(date: str, region: str):
    # expensive API call
    return api.get_report(date=date, region=region)

fetch_report("2024-01-01", "EMEA")  # fetches and writes to .cache/
fetch_report("2024-01-01", "EMEA")  # reads from .cache/ — instant
Making the cache directory configurable
Wrapping the decorator in another function lets you pass configuration:
def cached(folder=".cache"):
    def decorator(func):
        cache_dir = Path(folder)
        @wraps(func)
        def wrapper(*args, **kwargs):
            args_str = json.dumps({"args": args, **kwargs}, cls=FallbackEncoder, sort_keys=True)
            digest = hashlib.sha1(args_str.encode()).hexdigest()[:12]
            cache_file = cache_dir / f"{func.__name__}_{digest}"
            if cache_file.exists():
                with open(cache_file, "rb") as f:
                    return pickle.load(f)
            result = func(*args, **kwargs)
            cache_dir.mkdir(parents=True, exist_ok=True)
            with open(cache_file, "wb") as f:
                pickle.dump(result, f)
            return result
        return wrapper
    return decorator
Usage with and without arguments:
@cached()  # uses default folder ".cache"
def default_cached_fn(x):
    ...

@cached(folder="data/.cache")  # custom folder
def custom_cached_fn(x):
    ...
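Note that a bare @cached (no parentheses) would pass the function itself as folder. A common pattern for supporting both spellings — shown here as a sketch, not part of the code above, with the caching body elided to a passthrough — detects which way it was called:

```python
from functools import wraps

def cached(func=None, *, folder=".cache"):
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # file-cache lookup/store would go here, using Path(folder)
            return f(*args, **kwargs)
        return wrapper
    if func is not None:
        # called bare: @cached
        return decorator(func)
    # called with arguments: @cached(folder=...)
    return decorator

@cached
def f(x):
    return x + 1

@cached(folder="data/.cache")
def g(x):
    return x * 2
```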
When to use this vs functools.lru_cache
If you only need in-memory caching, Python's built-in functools.lru_cache is better than rolling your own — it's implemented in C and handles LRU eviction:
from functools import lru_cache

@lru_cache(maxsize=128)
def expensive(n):
    ...
The file-based approach is useful when:
- Results need to survive process restarts
- Computations take minutes rather than seconds (so the pickle I/O overhead is irrelevant)
- You're iterating on downstream logic and don't want to re-fetch upstream data every run