Saving 9 GB of RAM with Python __slots__

November 2013

Original article on tech.oyster.com

We’ve mentioned before how Oyster.com’s Python-based web servers cache huge amounts of static content in huge Python dicts (hash tables). Well, we recently saved over 2 GB in each of four 6 GB server processes with a single line of code — using __slots__ on our Image class.

Here’s a screenshot of RAM usage before and after deploying this change on one of our servers:

Screenshot of RAM usage before and after

We allocate about a million instances of a class like the following:

class Image(object):
    def __init__(self, id, caption, url):
        self.id = id
        self.caption = caption
        self.url = url
        self._setup()

    # ... other methods ...

By default Python uses a dict to store an object’s instance attributes. Which is usually fine, and it allows fully dynamic things like setting arbitrary new attributes at runtime.

However, for small classes that have a few fixed attributes known at “compile time”, the dict is a waste of RAM, and this makes a real difference when you’re creating a million of them. You can tell Python not to use a dict, and only allocate space for a fixed set of attributes, by settings __slots__ on the class to a fixed list of attribute names:

class Image(object):
    __slots__ = ['id', 'caption', 'url']

    def __init__(self, id, caption, url):
        self.id = id
        self.caption = caption
        self.url = url
        self._setup()

    # ... other methods ...

Note that you can also use collections.namedtuple, which allows attribute access, but only takes the space of a tuple, so it’s similar to using __slots__ on a class. However, to me it always feels weird to inherit from a namedtuple class. Also, if you want a custom initializer you have to override __new__ rather than __init__.

Warning: Don’t prematurely optimize and use this everywhere! It’s not great for code maintenance, and it really only saves you when you have thousands of instances.