Kyle Edwards

High Performance Python

Below are a few notes collected while reading High Performance Python by Micha Gorelick and Ian Ozsvald.

Lists and Tuples

Understanding data structures is key to writing performance programs. In python, arrays are classified as either lists or tuples. Lists are mutable while tuples are static. Arrays in python are in reality arrays of pointers, which comes at a cost, but allows lists to contain complex Python objects and support dynamic typing.

Providing the __eq__ and __lt__ dunder methods and using functools.total_ordering allows you to create custom data types that support sorting.

Python’s native sorting algorithm is called “Tim sort,” which is a hybrid of insertion sort and merge sort that uses heuristics to analyze the data in the list to determine which strategy will be more performant.

The bisect module provides a lot of helpful methods around inserting into and sorting pre-sorted lists, which may be a useful optimization to make.

Tuples are cached by the Python runtime and don’t always have to talk to the kernel to request memory, which can provide a performance boost. Once a tuple of a certain size is created and goes out of scope, that memory is not immediately returned to the OS. This can provide a significant gain if your program tends to use tuples of the same size repeatedly.

Pick the right data structure first. Using a suboptimal structure will most likely outweigh any performance gains from small tweaks.

Comprehensions are always more performant than building up a new list by appending to it. On append, lists can potentially overallocate memory to make room for future appends. This doesn’t happen for list creation through comprehensions, which can save memory and provide increased computation speed.

Dictionaries and Sets

Dictionaries and sets are extremely dependent on their hash functions. If a hash function has a poor load factor leading to many cache collisions, it could take a considerable amount of time to calculate the correct bucket for keys.

User-defined types can implement the __hash__ and __cmp__ dunder methods to allow them to be used as keys in dicts and sets. This seems particularly useful for sets as you can test for uniqueness.

Namespaces

When searching the scopes for a variable, function, or module, Python first looks at the locals() scope, before going through the globals() and finally __builtin__. Additionally, searching through a module for a given property also incurs a dictionary lookup. Because of this, it’s important to explicitly import those members. If extreme performance is needed, a hack is to supply the member as a default property of a function, placing it in the local scope. It’s not quite as readable, but if you’re iterating that method millions of time, it could improve performance.

Generators

Generators are functions that yield many results using lazy evaluation, and don’t just return a single result at the end of the code’s execution. Primarily, generators are used when you need to calculate values without storing the result. An example of this is within Python’s built-in range() function. Generators are iterable out of the box, so they can be used in for/in blocks. When generators are done yielding values, they throw a StopIteration exception.

It’s very important to acknowledge the tradeoff between memory and CPU when using generators over lists. Because of this, be wary of implicit transformations between the two, which will happen when using a generator within a list comprehension (use a generator comprehension instead!).

Generators allow for infinite series, which if used correctly, are a powerful tool combined with compositional functions like those found in itertools.

Performance Profiling

Below are some recommendations for tools to profile your Python applications and to dig into low-level bytecode:

A recommended initial strategy for profiling your applications is to use cProfile to determine which functions take up a lot of computation time, and then use list_profiler to dig in and analyze it line-by-line.