These are my notes on publishing a Python module on PyPI in 2018 using C++17, Boost and Swig.
Note: This is indeed my first Python rodeo! The primary target audience is ‘Me in 6 Months’, so YMMV. But hopefully, it may be of use to anyone going down a similar route of using C++17, Boost and Swig to build and package a Python module.
Hext is my little library that has (simple) language bindings for Python,
i.e. you are able to use Hext within a Python project.
If a user were to install the Debian-packages,
Hext’s dependencies would automatically be installed through Apt.
All the required libraries would automatically be copied into memory by
the dynamic linker and be available to the application at runtime.
import hext instructs the Python interpreter to load the hext package,
which in turn loads the compiled Python module _hext.so,
the "glue" between Python and Hext.
This dynamically shared object depends on other libraries, which are satisfied through the dynamic linker.
A YOLO approach to dependency management
Unfortunately, we cannot expect the target system to have Gumbo and Boost installed, nor a standard library for C++17.
That leaves us with only one option: Producing a Python module that includes all (most) dependencies by linking statically.
Linking statically brings its own bag of problems, especially since security updates require recompilation and redistribution.
This is also known as the YOLO method, because ‘You Only Link Once’ :)
A Build Environment for Python Modules
Binary modules and system compatibility
To be binary compatible with most systems, it is recommended to use CentOS 5 as a build target (PEP 513).
CentOS 5 was first released in 2007 and has been End-of-Life since March 2017.
Additionally, there’s a GCC 4.8.2, which is quite old, dare I say :)
This means we need to either compile our own GCC, or use somebody else’s precompiled GCC for CentOS 5.
Fortunately, compiling the most recent GCC (8.2.0 in my case) on CentOS 5 is straightforward.
With the new GCC toolchain in place, building all other dependencies is a non-issue.
CMake provides binary releases for linux-x86_64, but those require Glibc 2.6,
which is not available on CentOS 5 (which is stuck with 2.5).
An alternative is to install a precompiled version of CMake through pip.
Building A Python Module
We now have everything set up to actually build the Python module.
Make sure to add the include path of the Python version you want to build against:
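One way to find that include path is to ask the target interpreter itself. The following is a minimal sketch using the standard sysconfig module; the helper name python_include_flags is made up for illustration and is not part of any build tool.

```python
import sysconfig


def python_include_flags():
    """Return -I compiler flags for the headers of the running interpreter."""
    paths = sysconfig.get_paths()
    include_dirs = []
    # "include" holds Python.h; "platinclude" holds platform-specific headers
    # and is often the same directory, so duplicates are dropped.
    for key in ("include", "platinclude"):
        directory = paths.get(key)
        if directory and directory not in include_dirs:
            include_dirs.append(directory)
    return ["-I" + directory for directory in include_dirs]


# Print the flags so they can be pasted into a compiler invocation.
print(" ".join(python_include_flags()))
```

Run it with each interpreter you want to build against, since every Python version in the build environment has its own header directory.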
Stripping your Module
Stripping removes symbols and other leftovers from static linking and considerably reduces the file size of your module.
Use ldd to list your module’s dependencies on dynamically shared objects:
In the above example, _mymodule.so only depends on very old versions of libm and libc,
and is therefore compatible with most linux-based systems.
You can ignore linux-vdso.so
and ld-linux-x86-64.so (ld.so).
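If you want to automate this check, the ldd output can be parsed with a few lines of Python. This is only a sketch: the function names are made up, and the sample text is hand-written to resemble typical ldd output (the paths and addresses are not from a real build).

```python
import os
import subprocess

# Entries that ldd always reports and that can be ignored.
IGNORED_PREFIXES = ("linux-vdso", "ld-linux")


def parse_ldd_output(text):
    """Extract shared-library names from ldd output, skipping vdso and ld.so."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The first field is either a bare name or an absolute path.
        name = os.path.basename(fields[0])
        if ".so" not in name or name.startswith(IGNORED_PREFIXES):
            continue
        names.append(name)
    return names


def shared_dependencies(path):
    """Run ldd on the file at `path` and return its dependency names."""
    output = subprocess.check_output(["ldd", path])
    return parse_ldd_output(output.decode("utf-8", "replace"))


# Hand-written sample resembling ldd output for a statically linked module.
SAMPLE = """\
\tlinux-vdso.so.1 (0x00007ffd4a5f2000)
\tlibm.so.6 => /lib64/libm.so.6 (0x00007f1c2a400000)
\tlibc.so.6 => /lib64/libc.so.6 (0x00007f1c2a000000)
\t/lib64/ld-linux-x86-64.so.2 (0x00007f1c2a800000)
"""
print(parse_ldd_output(SAMPLE))  # ['libm.so.6', 'libc.so.6']
```

Calling shared_dependencies("_mymodule.so") on the build machine would return the same kind of list for a real module, which can then be compared against the libraries you are allowed to depend on.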
A list of libraries which you can depend on
when building Python modules for the manylinux1 platform tag is outlined in PEP 513.
You can get more verbose output by invoking the dynamic linker directly
and setting some environment variables:
Static libstdc++ and libgcc
When building your Python module, it is important to tell GCC to statically link libstdc++ and libgcc,
i.e. -static-libgcc -static-libstdc++, so as not to accidentally introduce a dependency on your toolchain.
CMake & Static Libraries
If CMake has trouble picking up the static version of libraries,
experiment with the following CMake flags:
-DCMAKE_FIND_LIBRARY_SUFFIXES=.a and -DBoost_USE_STATIC_LIBS=On.
Swig and Python 2 Unicode
If you are using Swig and building a module for Python 2 which accepts strings passed from Python to C++,
make sure to add the following to your interface file:
An easy way to test your Python interface on whether it accepts Unicode strings:
If there’s a TypeError thrown, like the following, the Python Unicode string is not accepted
as an argument for a parameter of type std::string:
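A check along these lines can be wrapped in a small helper. The sketch below is an assumption about how one might do it: accepts_unicode is a hypothetical helper, and the two stand-in functions only imitate a wrapped C++ function so that the example runs without a compiled module.

```python
def accepts_unicode(func):
    """Return True if `func` takes a Unicode string without raising TypeError."""
    try:
        func(u"\u2603 snowman")
    except TypeError:
        return False
    return True


# Stand-ins for wrapped C++ functions; swap in a function from your own module.
def takes_any_string(text):
    return len(text)


def rejects_unicode(text):
    # Imitates a binding that only accepts byte strings.
    if not isinstance(text, bytes):
        raise TypeError("expected a byte string")
    return len(text)


print(accepts_unicode(takes_any_string))  # True
print(accepts_unicode(rejects_unicode))   # False
```

The u"" prefix keeps the literal a Unicode string on Python 2, which is the case that matters here.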
Don’t link libpythonX
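Linking libpython is neither necessary nor recommended. The interpreter that imports the module already provides these symbols, and PEP 513 asks manylinux1 extensions not to link against libpythonX.Y.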
Modules and __init__.py
When the Python interpreter encounters the line import mymodule,
it will try to find a directory called mymodule that contains a file with the special name __init__.py,
which is then executed.
In other words, __init__.py is responsible for loading the shared library and setting up the module.
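To see this mechanism in isolation, the sketch below builds a throwaway package in a temporary directory and imports it. The package name demo_pkg_init and the attribute loaded_by are made up for the demonstration; a real module would load its shared library at this point instead of setting a variable.

```python
import importlib
import os
import shutil
import sys
import tempfile


def import_demo_package():
    """Create a throwaway package and import it, showing that __init__.py runs."""
    root = tempfile.mkdtemp()
    package_dir = os.path.join(root, "demo_pkg_init")
    os.mkdir(package_dir)
    # Whatever is written here is executed when the package is imported.
    with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
        handle.write("loaded_by = '__init__.py'\n")
    sys.path.insert(0, root)
    try:
        return importlib.import_module("demo_pkg_init")
    finally:
        # The module stays loaded; only the search path and files are cleaned up.
        sys.path.remove(root)
        shutil.rmtree(root, ignore_errors=True)


module = import_demo_package()
print(module.loaded_by)  # __init__.py
```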
Swig and __init__.py
Swig generates a mymodule.py to be used as a loader for the compiled module _mymodule.so.
Unfortunately, you cannot just rename this file to __init__.py and be done with it.
I am not sure,
but it seems that the generated Python script isn’t supposed to be used as an __init__.py:
The easiest way to load a shared library residing in the same directory as __init__.py,
which works with Python ≥ 2.7:
I am using the following bash script to automatically replace Swig’s loader with the above line:
Packaging a precompiled module for PyPI
Precompiled modules and executables are uploaded to PyPI in the form of wheels.
For example, the filename of my wheel hext-0.2.0-cp37-cp37m-manylinux1_x86_64.whl tells us:
hext-0.2.0: The package provided is hext, in version 0.2.0
cp37: For Python version 3.7
cp37m: Linked against the Python 3.7 Application Binary Interface (for example, Python 2.7 has separate ABIs, cp27m and cp27mu, for different Unicode string types)
manylinux1: It is compatible with systems that fulfill the manylinux1 platform tag
x86_64: Expected system architecture
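The same breakdown can be done programmatically. The sketch below splits a wheel filename into the fields listed above, following the naming scheme from PEP 427 (which also allows an optional build tag after the version); the names WheelName and parse_wheel_filename are made up for illustration.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python abi platform"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between version and Python tag.
        distribution, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected number of fields in %r" % filename)
    return WheelName(distribution, version, build, python, abi, platform)


info = parse_wheel_filename("hext-0.2.0-cp37-cp37m-manylinux1_x86_64.whl")
print(info.python, info.abi, info.platform)  # cp37 cp37m manylinux1_x86_64
```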
You can upload as many wheels as you want.
The user’s package manager (pip) will choose the appropriate wheel for the user’s environment.
Packaging Python modules is done through setuptools
and a setup file traditionally named setup.py.
Example project layout:
setup.py might look like this:
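As a rough sketch, and not necessarily the file used for hext, a setup.py for a package that ships a precompiled _mymodule.so could be structured as follows. The BinaryDistribution class is a commonly used trick, assumed here, to make setuptools treat the package as platform-specific even though it compiles nothing itself; the metadata values are placeholders.

```python
import sys


def setup_kwargs():
    """Metadata for a package that bundles a precompiled shared library."""
    return dict(
        name="mymodule",
        version="0.0.1",
        packages=["mymodule"],
        # Ship the compiled module next to mymodule/__init__.py.
        package_data={"mymodule": ["_mymodule.so"]},
        zip_safe=False,
    )


def run_setup():
    from setuptools import setup
    from setuptools.dist import Distribution

    class BinaryDistribution(Distribution):
        """Report extension modules so the wheel gets a platform tag."""

        def has_ext_modules(self):
            return True

    setup(distclass=BinaryDistribution, **setup_kwargs())


# Only invoke setuptools when a command such as bdist_wheel was given.
if __name__ == "__main__" and len(sys.argv) > 1:
    run_setup()
```

With this layout the package directory mymodule is expected to contain both __init__.py and _mymodule.so.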
If all the pieces are in the right place, you can now create a wheel:
So this is the wheel: mymodule-0.0.1-cp27-cp27m-linux_x86_64.whl.
Notice how it says -linux instead of -manylinux1.
This is because setuptools
cannot tell which subset of Linux systems
this wheel might be compatible with.
Fortunately, renaming the wheel is enough, i.e. replace linux with manylinux1:
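The rename is a plain file move. As a sketch, the new name can also be computed in Python, which is handy when several wheels are built in one go; manylinux1_name and retag_wheel are hypothetical helpers, not part of setuptools or any other tool.

```python
import os


def manylinux1_name(wheel_filename):
    """Return the filename with its linux_* platform tag changed to manylinux1_*."""
    stem, extension = os.path.splitext(wheel_filename)
    if extension != ".whl":
        raise ValueError("not a wheel filename: %r" % wheel_filename)
    # The platform tag is the last dash-separated field.
    prefix, platform = stem.rsplit("-", 1)
    if not platform.startswith("linux_"):
        raise ValueError("unexpected platform tag: %r" % platform)
    architecture = platform[len("linux_"):]
    return "%s-manylinux1_%s%s" % (prefix, architecture, extension)


def retag_wheel(path):
    """Rename the wheel at `path` on disk and return the new path."""
    directory, filename = os.path.split(path)
    new_path = os.path.join(directory, manylinux1_name(filename))
    os.rename(path, new_path)
    return new_path


print(manylinux1_name("mymodule-0.0.1-cp27-cp27m-linux_x86_64.whl"))
# mymodule-0.0.1-cp27-cp27m-manylinux1_x86_64.whl
```

Renaming only makes sense once you have checked that the module's remaining dynamic dependencies really are acceptable for manylinux1.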
Publishing Wheels on pypi.org
Now the only task left is to create an account on pypi.org and
to finally publish your wheel.
The recommended way to upload wheels is via twine:
The complete setup.py for hext
As an example, this is the setup.py I used for packaging hext v0.2.0.
Note that Hext also includes a command-line utility called htmlext.