Tremendously speed up Python code with Cython and package it with Poetry

Egor BlagovEgor Blagov
7 min read

Today we are going to make a C++ extension for Python code, wrap it and embed easily with Cython and then make it installable via Poetry. This article is a continuation of my previous article (you can find it here) about implementing a Trie data structure, since I’ve promised you to make the trie faster. Let’s start!

Brief on Cython

Cython is an optimizing static compiler for both Python and the extended Cython language. You can find more info on the official website https://cython.org/. I came up with a meme, describing what Cython is:

(Screenshot from video by Allen Pan)

With Cython you can write Python-like code with typing and Cython features, and then convert it into C++ code, which then can be compiled. We want some speed, so let’s go for it.

Making an extension

Cython helps easily to wrap C++ classes and embed them to Python, and coincidentally it’s exactly what we need. The class interface (Trie.h):

#ifndef TRIE_H
#define TRIE_H

#include <string>
#include <list>
#include <unordered_map>

class TrieNode
{
private:
    wchar_t letter;
    int insertion_count;
    std::unordered_map<wchar_t, TrieNode *> children;
    TrieNode *parent;
    std::wstring word;

    std::wstring bottom_up_traversal() const;

    std::list<const TrieNode *> word_nodes() const;
    inline std::wstring convert(const std::string &word) const;
    inline std::string reconvert(const std::wstring &word) const;
    inline std::list<std::string> unwrap_words(const std::list<const TrieNode *> &nodes) const;

public:
    TrieNode(wchar_t letter, TrieNode *parent = nullptr);
    ~TrieNode();
    void insert(const std::string &word, int multiplier = 1);
    int count(const std::string &word) const;
    std::list<std::string> complete(const std::string &prefix) const;
    std::list<std::string> ambiguous_complete(const std::list<std::string> &prefix_groups) const;
};

class Trie : public TrieNode
{
public:
    Trie();
    void extend(const std::list<std::string> &words, std::list<int> multipliers);
};

#endif

Note the differences from Python implementation:

  1. I’ve made two methods for complete: complete and ambiguous_complete. One is made for completing one prefix, and another is made to complete in a T9-like manner (have a look at previous article).

  2. I use wstring to make the trie capable of handling non-ascii letters, e.g. Russian or Japan. Thus we need to know how to convert string (which will contain bytes in case of Cython) into wstring and backwards.

You can find the implementation (Trie.cpp) at my GitHub (2 more commits, to my stats, yay!)

Embed C++ into Python

In Cython it’s done with .pxd files. We just need to declare the interface, in our case the interface looks so (Trie.pxd):

# distutils: language = c++

from libcpp.string cimport string
from libcpp.list cimport list

cdef extern from "Trie.cpp":
    pass

cdef extern from "Trie.h":
    cdef cppclass Trie:
        Trie() except +
        void insert(string, int)
        int count(string) const
        list[string] complete(const string) const
        list[string] ambiguous_complete(const list[string]) const
        void extend(const list[string] words, list[int] multipliers)

except + alongside the constructor is needed for the proper handling of exception by Python. For the other points the declaration just follows the trie header and uses C++ std classes for types specifying.

After that I’ve made a Python wrapper of a C++ class, and it looks this way (cytrie.pyx):

# cython: language_level=3

from Trie cimport Trie

cdef class CyTrie:
    cdef Trie c_trie;

    def __cinit__(self):
        self.c_trie = Trie()

    def insert(self, word: str, multiplier: int = 1):
        word_bytes = word.encode('utf8')
        self.c_trie.insert(word_bytes, multiplier)

    def count(self, word: str) -> int:
        word_bytes = word.encode('utf8')
        return self.c_trie.count(word_bytes)

    def complete(self, word: 'str | list[str]') -> 'Iterable[str]':
        if isinstance(word, str):
            word_bytes = word.encode('utf8')
            results = self.c_trie.complete(word_bytes)
        else:
            results = self.c_trie.ambiguous_complete([w.encode('utf8') for w in word])

        yield from [w.decode() for w in results]

    def extend(self, words: list[str], multipliers: 'list[int] | None' = None):
        words_bytes = [word.encode('utf8') for word in words]
        if multipliers is None:
            self.c_trie.extend(words_bytes, [1 for _ in range(len(words_bytes))])
        else:
            self.c_trie.extend(words_bytes, multipliers)

    def __iter__(self) -> "Iterable[str]":
        return iter(self.complete(""))

    def __contains__(self, word: str) -> bool:
        return bool(self.count(word))

    def __len__(self) -> int:
        return len(list(self.complete("")))
  1. # cython: language_level=3 means it’s Python 3 (not Python 2).

  2. Other methods just use Python features to make it simpler to implement (single complete method, extend with optional multipliers argument and so on.

The directory structure, is trivial, I’ve placed everything to the same directory, and we have these files:

  1. Trie.cpp — C++ implementation

  2. Trie.h — C++ header

  3. Trie.pxd — Cython declaration of a C++ class

  4. cytrie.pyx — Cython/Python wrapper of a declared C++ class

  5. __init__.py

Ok, we’re good to go and compile our code, to try it out we can use cythonize command, available with Cython:

#!/bin/bash

CFLAGS='-std=c++17' cythonize -i cytrie.pyx

This command will convert Cython declarations into a single cytrie.cpp file with C++ code, that is bundled to a library file then. The result can be imported into Python and used:

>>> import cytrie
>>> dir(cytrie)
['CyTrie', '__builtins__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__test__']
>>> x = cytrie.CyTrie()
>>> x.extend('please help I am a hostage of my own perfectionism'.split(' '))
>>> list(x.complete('p'))
['please', 'perfectionism']

The code can be used already, though we have several things left.

Benchmark

The benchmark I have consists of words taken from Shakespeare’s texts. The benchmark tries to complete all prefixes of each word in a vocabulary:

for word in words:
    for i in range(1, len(word)):
        trie.complete(word[:i])

Now let’s start our Python implementation and Cython/C++ implementation and compare the results:

TL;DR: C++ implementation is 1400 times faster than Python one

Brief on Poetry

Now let’s go over another important topic — packaging. At first, when I just started my experiments, I didn’t pay enough attention to it, but then it came back to me, so I’ll walk you through the packaging of our Cython code using Poetry.

Poetry is a Python packaging and dependency management tool. With Poetry making a package is straightforward (until it comes to Cython compilation, heh). Poetry uses pyproject.toml for managing settings and dependencies. More info about Poetry can be found here https://python-poetry.org/. Poetry for Python helps to keep everything clean.

How to use Cython with Poetry

To make Cython work with Poetry we must somehow tell Poetry to run cythonize on our code during the installation, after that we should compile our .cpp files into a bundle. To make it happen we will use an undocumented feature of poetry -- we need to add these options to the pyproject.yaml:

[build-system]
requires = ["poetry-core", "Cython"]
build-backend = "poetry.core.masonry.api"

[tool.poetry.build]
generate-setup-file = false
script = 'build.py'

build-system tells what we need to install our package. Since we have C++ code, we either should compile it for every possible platform and publish it as a wheel, or we can publish it as an sdist. And our users will have it compiled automatically on installation.

As build dependencies of course we need cython because it has cythonize method we need, and poetry-core to make our poetry.core.masonry.api build backend available.

In tool.poetry.build section we specify the name of our build script. The script will be executed on client’s machine when we need to build an sdist and make a wheel. We don’t need to reinvent the wheel though.

I’ve made a minimal setup for you, thus the build.py looks so:

import os
import shutil
from distutils.core import Distribution, Extension

from Cython.Build import build_ext, cythonize

cython_dir = os.path.join("trie_again", "_ext")
extension = Extension(
    "trie_again.cytrie",
    [
        os.path.join(cython_dir, "cytrie.pyx"),
    ],
    extra_compile_args=["-O3", "-std=c++17"],
)

ext_modules = cythonize([extension], include_path=[cython_dir])
dist = Distribution({"ext_modules": ext_modules})
cmd = build_ext(dist)
cmd.ensure_finalized()
cmd.run()

for output in cmd.get_outputs():
    relative_extension = os.path.relpath(output, cmd.build_lib)
    shutil.copyfile(output, relative_extension)
  1. We have our .pxd, .pyx, .cpp files in a subdirectory, hence I specify it.

  2. An Extension object usage is just a regular way of specifying a C++ extension for Python.

  3. We use cythonize to make a .cpp file with our extension.

  4. We use a Distribution object to pass our code into a build_ext command.

  5. build_ext command is used to bundle our .cpp into a library file.

  6. Lastly, we copy our results (library bundle) into the installed directory at the place we specified in the Extension.

When we work in our development environment, each time we change our C++ code we have to do poetry install. This will cythonize our code and compile it. To make a distribution we should do poetry build -f sdist. And then we can publish our package to the PyPi.

Conclusion

As we see C++ module makes our code run hyper-fast and it’s easy the extension within Python. Though in some cases you might not need C++ at all, Cython with its syntax already can generate C/C++ code that covers a huge amount of scenarios, so it’s worth investigating.

Hope everything looks clear and simple, but I’ve spent several days trying to figure it out.

Good hacking!

Gently~Neito Monoma X OC | Snake, Cute reptiles, Pet snake

11
Subscribe to my newsletter

Read articles from Egor Blagov directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Egor Blagov
Egor Blagov

Senior Software Engineer, python learner, C++ lover, hardware enthusiast