What is memoization and how can I use it in Python?

The Question :

391 people think this question is useful

I just started Python and I’ve got no idea what memoization is and how to use it. Also, may I have a simplified example?

The Question Comments :
  • When the second sentence of the relevant wikipedia article contains the phrase “mutually-recursive descent parsing[1] in a general top-down parsing algorithm[2][3] that accommodates ambiguity and left recursion in polynomial time and space,” I think it is entirely appropriate to ask SO what is going on.
  • @Clueless: That phrase is preceded by “Memoization has also been used in other contexts (and for purposes other than speed gains), such as in”. So it’s just a list of examples (and need not be understood); it’s not part of the explanation of memoization.
  • @StefanGruenwald That link is dead. Can you please find an update?
  • New link to pdf file, since pycogsci.info is down: people.ucsc.edu/~abrsvn/NLTK_parsing_demos.pdf
  • @Clueless, The article actually says “simple mutually-recursive descent parsing[1] in a general top-down parsing algorithm[2][3] that accommodates ambiguity and left recursion in polynomial time and space”. You missed the simple, which obviously makes that example much clearer :).

The Answer 1

364 people think this answer is useful

Memoization effectively refers to remembering (“memoization” → “memorandum” → to be remembered) results of method calls based on the method inputs and then returning the remembered result rather than computing the result again. You can think of it as a cache for method results. For further details, see page 387 for the definition in Introduction To Algorithms (3e), Cormen et al.

A simple example for computing factorials using memoization in Python would be something like this:

factorial_memo = {}
def factorial(k):
    if k < 2: return 1
    if k not in factorial_memo:
        factorial_memo[k] = k * factorial(k-1)
    return factorial_memo[k]

You can get more complicated and encapsulate the memoization process into a class:

class Memoize:
    def __init__(self, f):
        self.f = f
        self.memo = {}
    def __call__(self, *args):
        if not args in self.memo:
            self.memo[args] = self.f(*args)
        #Warning: You may wish to do a deepcopy here if returning objects
        return self.memo[args]


def factorial(k):
    if k < 2: return 1
    return k * factorial(k - 1)

factorial = Memoize(factorial)

A feature known as “decorators” was added in Python 2.4 which allow you to now simply write the following to accomplish the same thing:

def factorial(k):
    if k < 2: return 1
    return k * factorial(k - 1)

The Python Decorator Library has a similar decorator called memoized that is slightly more robust than the Memoize class shown here.

The Answer 2

238 people think this answer is useful

New to Python 3.2 is functools.lru_cache. By default, it only caches the 128 most recently used calls, but you can set the maxsize to None to indicate that the cache should never expire:

import functools

def fib(num):
    if num < 2:
        return num
        return fib(num-1) + fib(num-2)

This function by itself is very slow, try fib(36) and you will have to wait about ten seconds.

Adding lru_cache annotation ensures that if the function has been called recently for a particular value, it will not recompute that value, but use a cached previous result. In this case, it leads to a tremendous speed improvement, while the code is not cluttered with the details of caching.

The Answer 3

61 people think this answer is useful

The other answers cover what it is quite well. I’m not repeating that. Just some points that might be useful to you.

Usually, memoisation is an operation you can apply on any function that computes something (expensive) and returns a value. Because of this, it’s often implemented as a decorator. The implementation is straightforward and it would be something like this

memoised_function = memoise(actual_function)

or expressed as a decorator

def actual_function(arg1, arg2):

The Answer 4

18 people think this answer is useful

Memoization is keeping the results of expensive calculations and returning the cached result rather than continuously recalculating it.

Here’s an example:

def doSomeExpensiveCalculation(self, input):
    if input not in self.cache:
        <do expensive calculation>
        self.cache[input] = result
    return self.cache[input]

A more complete description can be found in the wikipedia entry on memoization.

The Answer 5

15 people think this answer is useful

Let’s not forget the built-in hasattr function, for those who want to hand-craft. That way you can keep the mem cache inside the function definition (as opposed to a global).

def fact(n):
    if not hasattr(fact, 'mem'):
        fact.mem = {1: 1}
    if not n in fact.mem:
        fact.mem[n] = n * fact(n - 1)
    return fact.mem[n]

The Answer 6

15 people think this answer is useful

I’ve found this extremely useful

def memoize(function):
    from functools import wraps

    memo = {}

    def wrapper(*args):
        if args in memo:
            return memo[args]
            rv = function(*args)
            memo[args] = rv
            return rv
    return wrapper

def fibonacci(n):
    if n < 2: return n
    return fibonacci(n - 1) + fibonacci(n - 2)


The Answer 7

6 people think this answer is useful

Memoization is basically saving the results of past operations done with recursive algorithms in order to reduce the need to traverse the recursion tree if the same calculation is required at a later stage.

see http://scriptbucket.wordpress.com/2012/12/11/introduction-to-memoization/

Fibonacci Memoization example in Python:

fibcache = {}
def fib(num):
    if num in fibcache:
        return fibcache[num]
        fibcache[num] = num if num < 2 else fib(num-1) + fib(num-2)
        return fibcache[num]

The Answer 8

5 people think this answer is useful

Memoization is the conversion of functions into data structures. Usually one wants the conversion to occur incrementally and lazily (on demand of a given domain element–or “key”). In lazy functional languages, this lazy conversion can happen automatically, and thus memoization can be implemented without (explicit) side-effects.

The Answer 9

5 people think this answer is useful

Well I should answer the first part first: what’s memoization?

It’s just a method to trade memory for time. Think of Multiplication Table.

Using mutable object as default value in Python is usually considered bad. But if use it wisely, it can actually be useful to implement a memoization.

Here’s an example adapted from http://docs.python.org/2/faq/design.html#why-are-default-values-shared-between-objects

Using a mutable dict in the function definition, the intermediate computed results can be cached (e.g. when calculating factorial(10) after calculate factorial(9), we can reuse all the intermediate results)

def factorial(n, _cache={1:1}):    
        return _cache[n]           
    except IndexError:
        _cache[n] = factorial(n-1)*n
        return _cache[n]

The Answer 10

4 people think this answer is useful

Here is a solution that will work with list or dict type arguments without whining:

def memoize(fn):
    """returns a memoized version of any function that can be called
    with the same list of arguments.
    Usage: foo = memoize(foo)"""

    def handle_item(x):
        if isinstance(x, dict):
            return make_tuple(sorted(x.items()))
        elif hasattr(x, '__iter__'):
            return make_tuple(x)
            return x

    def make_tuple(L):
        return tuple(handle_item(x) for x in L)

    def foo(*args, **kwargs):
        items_cache = make_tuple(sorted(kwargs.items()))
        args_cache = make_tuple(args)
        if (args_cache, items_cache) not in foo.past_calls:
            foo.past_calls[(args_cache, items_cache)] = fn(*args,**kwargs)
        return foo.past_calls[(args_cache, items_cache)]
    foo.past_calls = {}
    foo.__name__ = 'memoized_' + fn.__name__
    return foo

Note that this approach can be naturally extended to any object by implementing your own hash function as a special case in handle_item. For example, to make this approach work for a function that takes a set as an input argument, you could add to handle_item:

if is_instance(x, set):
    return make_tuple(sorted(list(x)))

The Answer 11

3 people think this answer is useful

Solution that works with both positional and keyword arguments independently of order in which keyword args were passed (using inspect.getargspec):

import inspect
import functools

def memoize(fn):
    cache = fn.cache = {}
    def memoizer(*args, **kwargs):
        kwargs.update(dict(zip(inspect.getargspec(fn).args, args)))
        key = tuple(kwargs.get(k, None) for k in inspect.getargspec(fn).args)
        if key not in cache:
            cache[key] = fn(**kwargs)
        return cache[key]
    return memoizer

Similar question: Identifying equivalent varargs function calls for memoization in Python

The Answer 12

2 people think this answer is useful
cache = {}
def fib(n):
    if n <= 1:
        return n
        if n not in cache:
            cache[n] = fib(n-1) + fib(n-2)
        return cache[n]

The Answer 13

2 people think this answer is useful

Just wanted to add to the answers already provided, the Python decorator library has some simple yet useful implementations that can also memoize “unhashable types”, unlike functools.lru_cache.

Add a Comment