Memoization: an introduction using Python
Memoization is a classic optimization technique that can speed up a function by having it “remember” inputs it previously computed. If a memoized function is called with inputs it remembers, it returns the previously computed value. If the function is computationally expensive, and the function may be called multiple times with the same input, this simple technique can be an easy way to speed up a function without adjusting the underlying algorithm.
I consider this type of optimization “classic” because it involves two time-honored computer science topics: caching and time-space tradeoffs.
Caching is the process of saving copies of data “closer” to the mechanism that’s doing a computation or making a request for data. Caches are absolutely everywhere in technology: CPUs keep recently used data in small, fast on-chip caches, web browsers cache pages and images on disk, DNS resolvers cache lookup results, and content delivery networks cache entire sites near their users.
The concept of caching is simple: if you have reason to suspect that some pieces of data will be required more than once, you can store copies “closer” to the entity that needs them.
Time-space tradeoffs involve the general observation that algorithms can be optimized for maximum time efficiency (fastest speed) or maximum space efficiency (lowest memory usage), but usually not both. If you are working in an environment with particular constraints, it’s often possible to adjust an algorithm to save time by using more space, or save space by spending more time.
Memoization, however, is an approach to optimizing functions that doesn’t involve changing the function at all. It’s a higher-level technique akin to caching, though its origins can be traced back to time-space tradeoffs by way of computational complexity theory.
You asked for it:
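    # A minimal sketch of the memoizing helper. The walkthrough below
    # refers to these names: memoized(), the dictionary memo, the inner
    # function f(), the wrapped function func, and its argument arg.
    def memoized(func):
        # Map each argument we have seen to the value func returned for it.
        memo = {}

        def f(arg):
            if arg in memo:
                # We have computed this before: skip the call entirely.
                return memo[arg]
            result = func(arg)   # do the real (possibly slow) work
            memo[arg] = result   # remember it for next time
            return result

        return f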
If this function looks simple, it’s because it is. We’ll see, however, that it won’t work for some functions. In particular, it won’t work for recursive functions, functions that take more than one argument, and functions whose arguments aren’t hashable.
However, for simple functions this works, and it illustrates the basic approach to memoization well. The memoized() function takes the function you want to memoize and returns a wrapped-up version of that function which has been memoized:
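    # Suppose myfunc() is some expensive single-argument function.
    # (This myfunc() is just a hypothetical stand-in.)
    def myfunc(n):
        return sum(i * i for i in range(n))

    # memoized() hands back a faster, cache-backed version:
    myfunc_faster = memoized(myfunc)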
You can then call the myfunc_faster() function as you would normally. However, you’ll notice that, after you’ve called the memoized function once with a particular argument, subsequent calls with that argument come back almost instantly:
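    import time

    start = time.time()
    myfunc_faster(10_000_000)    # first call: does the real work
    print(time.time() - start)   # takes a noticeable moment

    start = time.time()
    myfunc_faster(10_000_000)    # same argument again: cache hit
    print(time.time() - start)   # effectively instant
    # (Exact numbers will vary by machine.)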
How does this work? The memoized() function creates a dictionary, memo, that maps argument values of the function func to the return values that func produces for those arguments. (This is why the arguments have to be hashable: they are used as keys in this dictionary.)
Then the function defines another function, f(). If you’ve never seen a nested function definition before, it is written exactly like a normal function definition. However, this f() is special because it’s an example of a closure. A closure is just a function that holds on to some data that existed outside the function at the time of its definition. When the closure is called at some later time, it “remembers” that data, or in this case, “remembers” the reference to the dictionary memo.
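If closures are new to you, here is a tiny self-contained example of the idea, separate from memoization:

    def make_counter():
        count = 0  # data local to make_counter()

        def increment():
            nonlocal count   # reach back to the enclosing scope
            count += 1
            return count

        return increment

    counter = make_counter()   # make_counter() has returned, yet...
    print(counter())           # 1  (...count lives on in the closure)
    print(counter())           # 2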
You might notice that, when the memoized() function returns, the memo variable goes out of scope. But since the closure still holds a reference to the dictionary, it doesn’t disappear forever. Instead, it can be used whenever the closure is called. Similarly, the closure retains a reference to the function that is supposed to be memoized. This allows us to “intercept” calls to func().
In the closure, we first check whether the given arg is in the dictionary. If it is, the closure doesn’t have to call func() at all; it just returns the previously computed value. If the argument is not in the dictionary, func() is actually called, and the closure stores its result in the dictionary. Then the closure returns the value returned by func(). To the caller of the memoized function, this is indistinguishable from calling the original function (except, perhaps, that the memoized version may return much faster).
Here’s an example of a function I wrote specifically to be as inefficient as possible — it’s an implementation of insertion sort that sorts a string. (If you’ve never heard of insertion sort, that’s okay — this is just one algorithm that takes unsorted data and returns the same data in sorted order.)
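A sketch along those lines follows; the swapped() helper rebuilds the entire string for every single swap, which is where the memory waste comes from:

    def swapped(s, i, j):
        # Return a brand-new string with the characters at positions
        # i and j exchanged. Copying the whole string for every swap
        # is spectacularly wasteful, which is the point.
        chars = list(s)
        chars[i], chars[j] = chars[j], chars[i]
        return "".join(chars)

    def insertion_sort(s):
        # Sort the characters of the string s, one swap at a time.
        for i in range(1, len(s)):
            j = i
            while j > 0 and s[j - 1] > s[j]:
                s = swapped(s, j - 1, j)
                j -= 1
        return s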
There are a couple of reasons why this insertion_sort() function is bad; I won’t go into detail here, but its helper function swapped() is very memory inefficient, which makes insertion_sort() slow. For the purposes of learning about memoization, imagine that you did not or could not change the definition of this function to optimize it directly; instead, you’ll want to use memoization to speed up duplicate calls to it.
Then I’ll come up with a very large string that chokes up insertion_sort():
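    import random
    import string

    # The exact size here is a guess: tune it until the first call
    # takes a few seconds on your machine.
    random.seed(0)  # reproducible "random" data
    huge_str = "".join(random.choice(string.ascii_lowercase)
                       for _ in range(1000))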
If you copy this code into a Python file or drop it into a Python shell, you can execute the following call:
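    # Slow: with the sketch above, expect this to take a few seconds.
    insertion_sort(huge_str)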
On my MacBook, this call takes about three or four seconds to finish. The result is a large string of characters, all in order. When we run the memoized version for the first time, we’ll get the same performance:
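    sort_faster = memoized(insertion_sort)
    sort_faster(huge_str)   # first call: just as slow as the original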
However, if you run the sort_faster(huge_str) call again, you’ll find that it returns immediately with the result! This is because the closure checked the dictionary, found that the result for huge_str had already been computed, and, instead of calling insertion_sort(), returned the data from the dictionary.
With memoization, we saw that we could optimize a function without
changing the underlying algorithm by using a caching technique.
In particular, we employed a closure (our function
f() from above)
that “intercepted” calls to an underlying function and checked whether
the function’s return value was already known.
If the return value was already computed, we reduced the time taken to produce that value to the time it takes to look the data up in the dictionary. Since Python dictionary lookups take constant time on average, this resulted in a huge speedup, at least with my bad implementation of insertion sort.
If you’re looking for more complicated extensions of this, you might try getting this approach to work with functions that take more than one argument. (You might be able to collect the arguments into a tuple and use that as a key into the dictionary.) You could also make this work with recursion and potentially speed up a naive recursive fibonacci(), the function that computes the n-th Fibonacci number. (This is a classic application of memoization.)
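To give you a head start, here is one possible shape for both extensions (a sketch, not the only way to do it): keying the cache on the tuple of positional arguments handles multiple arguments, and rebinding the function’s own name with decorator syntax makes the recursive calls go through the cache too.

    # A generalized memoized() that accepts any number of positional
    # arguments (all of which must still be hashable).
    def memoized(func):
        memo = {}

        def f(*args):
            # *args arrives as a tuple, which is hashable, so it can
            # serve directly as a dictionary key.
            if args not in memo:
                memo[args] = func(*args)
            return memo[args]

        return f

    @memoized
    def fibonacci(n):
        # Because "fibonacci" now names the memoized wrapper, the
        # recursive calls below hit the cache as well.
        return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)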
I hope you enjoyed this short introduction to memoization, and don’t
hesitate to ask me questions or send suggestions. I can be reached at