Python script dependencies

03Oct24

TL;DR

‘--break-system-packages’ sounds scary, but (after some careful evaluation) it is likely to be the right way to go for infrastructure automation, at least until uv is ready for production. Python venvs seem to be what we’re expected to use, but they introduce additional complexity and associated fragility, which makes them a poor choice for system scripts.

Background

At Atsign we use a fair few Python scripts for infrastructure automation, and they’re built on top of dependencies that don’t come installed by default. This wasn’t previously a problem. We could ‘pip install’ what we needed (or in reality ‘pip install -r requirements.txt’ as we’re not savages, and actually keep track of dependencies).

But… things aren’t so simple in a post-Debian 12 world, which includes Ubuntu 24.04 and Raspberry Pi OS ‘Bookworm’. The operating system itself uses Python, and to protect its own scripts from being broken the system Python installation is now marked as an ‘externally-managed-environment’. Some additional packages can be installed using ‘apt install python3-whatever’, but that’s a (very) limited subset. We use a couple of the google-cloud packages, and they’re not available that way. Attempting to install a package with ‘pip’ results in a dire error:

error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.

    If you wish to install a non-Debian-packaged Python package,
    create a virtual environment using python3 -m venv path/to/venv.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
    sure you have python3-full installed.

    If you wish to install a non-Debian packaged Python application,
    it may be easiest to use pipx install xyz, which will manage a
    virtual environment for you. Make sure you have pipx installed.

    See /usr/share/doc/python3.12/README.venv for more information.

note: If you believe this is a mistake, please contact your Python
installation or OS distribution provider. You can override this, at
the risk of breaking your Python installation or OS, by passing
--break-system-packages.
hint: See PEP 668 for the detailed specification.

What should we do?

I’m writing this post because even though Debian 12 has been out for over a year, I’ve not been able to find any expert guidance on the topic. It feels like the elephant in the room is being studiously ignored.

Just use a venv – right?

I must confess that I was a virtual environment (venv) Luddite until the Debian 12 change, and that change has pushed me into wholesale adoption of venvs. But there’s a huge difference between stuff we do in dev, and what a prod environment should look like.

There’s also a difference between Python apps and Python scripts (for infrastructure).

As I see it there are two problems in using venvs for production scripts:

  1. The venv needs to be created and populated (on every machine where the scripts will run). Yeah, this is just a bunch more scripting, but that’s more toil, more tech debt, more stuff that can go wrong.
  2. Every script needs to be modified so that its shebang points at the venv’s Python interpreter rather than the usual ‘#!/bin/env python3’.
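To make those two steps concrete, here is a minimal sketch using only the standard library. The function names and the example path are hypothetical, and the `bin/python` layout assumes a POSIX system such as Debian; treat it as an illustration of the extra moving parts rather than a recommended deployment tool.

```python
import subprocess
import venv
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Path to the interpreter inside a venv (POSIX layout: bin/python)."""
    return venv_dir / "bin" / "python"


def build_venv(venv_dir: Path, requirements: Path) -> None:
    """Step 1: create the venv, then install the listed dependencies into it."""
    venv.create(venv_dir, with_pip=True)
    subprocess.run(
        [str(venv_python(venv_dir)), "-m", "pip", "install", "-r", str(requirements)],
        check=True,  # fail loudly if pip cannot install something
    )


def venv_shebang(venv_dir: Path) -> str:
    """Step 2: the shebang each script needs in place of the generic one."""
    return f"#!{venv_python(venv_dir)}"
```

Calling `build_venv(Path("/opt/automation/venv"), Path("requirements.txt"))` would have to happen on every machine before any script whose first line is `venv_shebang(...)` can run, which is exactly the extra toil described above.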

There’s a part of me that thinks if venvs are so damn good why didn’t the systems people build one for themselves, and leave the rest of us to muck up our default namespace like before?

uv to the rescue?

I asked about this problem on Mastodon, and Python Software Foundation (PSF) board member Simon Willison was kind enough to reply and point out that uv has some handy features that might be relevant.

Firstly I should say that uv has already become an integral part of how I use venvs – its speed, flexibility and simplicity are delightful (and so much better than the stock Python tools).

I really like what uv are doing with scripts and dependencies. Dependencies are explicitly declared in a chunk of metadata (inside a /// script tag) at the top of a script, and then ‘uv run’ takes care of making sure stuff is in place. I’ve made a habit of dropping a comment into scripts with any ‘pip install’ lines that are needed, but this takes it to the next level. Of course it’s still necessary to change the shebang to use ‘uv run’ (specifically #!/bin/env -S uv run -q, where -S tells env to split the rest of the line into separate arguments), but I think that’s a fair trade for explicit dependency management in situ.

The only thing stopping me switching to this approach is that uv is presently at version 0.4.18 and that 0 at the start says “don’t use this in production (yet)”.

Just --break-system-packages then?

First, I should say that wherever possible system Python packages should be installed with ‘apt install python3-whatever’ and that works for things like dnspython and dotenv that we use a fair bit.

But I’m writing this because that isn’t always an option.

It’s also worth noting that ‘--break-system-packages’ doesn’t actually do what it says. It’s a warning of potential breakage rather than a guarantee that breakage will occur. The key consideration is whether a package installed that way changes any of the system’s existing dependencies, and if so whether those changes will cause actual breakage.

The packages I’ve looked at, google-cloud-compute and google-cloud-pubsub, seem to be pretty well behaved. If I create a venv that inherits system packages, and then install those packages on top, they happily work with what’s already there rather than stomping on it. Here’s a diff of ‘pip list’ afterwards:

8a9
> cachetools                         5.5.0
18a20
> Deprecated                         1.2.14
21a24,31
> google-api-core                    2.20.0
> google-auth                        2.35.0
> google-cloud-compute               1.19.2
> google-cloud-pubsub                2.25.2
> googleapis-common-protos           1.65.0
> grpc-google-iam-v1                 0.13.1
> grpcio                             1.66.2
> grpcio-status                      1.66.2
24a35
> importlib_metadata                 8.4.0
38a50,52
> opentelemetry-api                  1.27.0
> opentelemetry-sdk                  1.27.0
> opentelemetry-semantic-conventions 0.48b0
40a55,56
> proto-plus                         1.24.0
> protobuf                           5.28.2
59a76
> rsa                                4.9
72a90,91
> wrapt                              1.16.0
> zipp                               3.20.2

There are probably one or two things (e.g. rsa) that I could have got from apt. But as they’re not there already, the core OS clearly isn’t using them (and so won’t break because of getting them with pip rather than apt).
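The same check can be scripted rather than eyeballed from a diff. The sketch below assumes the package lists are captured with ‘pip list --format=json’ before and after the install; the function names are hypothetical. It separates newly added packages from ones whose version changed, since changed versions are where breakage of system scripts would come from.

```python
import json
import subprocess


def parse_pip_list(json_text: str) -> dict[str, str]:
    """Turn the output of 'pip list --format=json' into {name: version}."""
    return {pkg["name"].lower(): pkg["version"] for pkg in json.loads(json_text)}


def pip_list(python: str) -> dict[str, str]:
    """Run 'pip list' for the given interpreter, e.g. a venv's bin/python."""
    result = subprocess.run(
        [python, "-m", "pip", "list", "--format=json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_pip_list(result.stdout)


def compare(before: dict[str, str], after: dict[str, str]):
    """Return (added, changed): new packages, and packages whose version moved."""
    added = {name: ver for name, ver in after.items() if name not in before}
    changed = {
        name: (before[name], ver)
        for name, ver in after.items()
        if name in before and before[name] != ver
    }
    return added, changed
```

An empty `changed` result corresponds to the diff above, where every line is an addition and nothing that was already installed has been replaced.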

So… we decided to use pip with ‘--break-system-packages’ to install what we can’t get from apt – at least for now, until uv matures to 1.0.0.
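As a rough sketch of how an install step might apply that decision, the code below only adds the flag when the interpreter is actually marked as externally managed. PEP 668 specifies the marker as a file named EXTERNALLY-MANAGED in the standard library directory. The function names and the ‘requirements.txt’ filename are illustrative assumptions, not part of our actual tooling.

```python
import sys
import sysconfig
from pathlib import Path


def is_externally_managed() -> bool:
    """True if this interpreter carries the PEP 668 marker file."""
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    return marker.is_file()


def pip_install_command(requirements: str, externally_managed: bool) -> list[str]:
    """Build the pip command, adding the override flag only where it is needed."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if externally_managed:
        cmd.append("--break-system-packages")
    cmd += ["-r", requirements]
    return cmd
```

Passing `pip_install_command("requirements.txt", is_externally_managed())` to `subprocess.run` keeps the same install script usable on older systems, where pip may not recognise the flag at all.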


