Switch from hatch to uv for package and environment management

Open binste opened this issue 1 year ago • 7 comments

What is your suggestion?

uv is now a full Python packaging manager (https://astral.sh/blog/uv-unified-python-packaging). It has had a lot of useful functionality for a while, but it now has all the features we'd need to replace hatch. Paraphrasing @mattijn (correct me if I missed anything!), switching to uv might help us attract new developers. It's very likely that uv is much more popular than hatch, judging by GitHub stars as well as development activity and feature set. Because of this:

  • It makes it easier for new Altair contributors to get started as they might already know it from other projects
  • Or if a new contributor has not yet used it, it's likely that they are more interested in learning it

Have you considered any alternative solutions?

Poetry would be an alternative. However, uv seems closer to hatch, so the switching cost would be lower. It also has a very fast solver, and we already have it installed for linting and formatting.

binste avatar Sep 06 '24 14:09 binste

FYI we're already using uv for dependencies.

I'd support switching over fully if there was no loss of functionality.

A first step would be trying to replace pip and hatch in all the GitHub workflows. They now have proper documentation - which was missing when I tried that out before

dangotbanned avatar Sep 06 '24 14:09 dangotbanned

I'd support switching over fully if there was no loss of functionality.

I'm also on board with this

joelostblom avatar Sep 06 '24 15:09 joelostblom

I think moving to uv sounds like a fine direction. One thing to throw out there is that I'm using pixi for VegaFusion, and will soon be using it for vl-convert. Pixi is a very similar idea, but has a few advantages for certain kinds of projects.

  • It can pull dependencies from PyPI using an embedded copy of uv, so this is just as fast as plain uv
  • It can also pull dependencies from conda-forge. This is really useful for non-Python development dependencies. For example, in VegaFusion I configure Pixi to install Rust, Java, and Node.js from conda-forge so that these don't need to be installed at the system level.
  • Pixi includes a cross-platform shell (based on Deno's shell) that it uses to evaluate tasks. This shell includes a bunch of basic Unix commands (cp, rm, pwd, cat, etc.) and it has access to any CLI dependencies installed from conda-forge. So you can write tasks as if you're on Linux, and they work the same on Windows.
  • Pixi tasks can depend on other tasks, and they support hash-based caching to intelligently skip running the parent tasks when not needed.
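To make the hash-based caching idea in the last bullet concrete, here is a minimal Python sketch of the general technique. It is not Pixi's implementation: the function names, the JSON cache file, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Iterable


def fingerprint(paths: Iterable[Path]) -> str:
    """Return one SHA-256 digest covering the names and contents of the inputs."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(str(path).encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def run_if_changed(
    name: str,
    inputs: Iterable[Path],
    action: Callable[[], None],
    cache_file: Path,
) -> bool:
    """Run `action` only if the inputs changed since the last recorded run.

    Returns True if the action ran and False if it was skipped.
    All names here are hypothetical; this only illustrates the idea.
    """
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    current = fingerprint(inputs)
    if cache.get(name) == current:
        return False  # inputs unchanged, so the task can be skipped
    action()
    cache[name] = current
    cache_file.write_text(json.dumps(cache))
    return True
```

A task that depends on another task would call `run_if_changed` for the parent first, so an unchanged parent costs only one hash computation.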

Altair development doesn't require any non-Python dependencies, so the advantage of Pixi over uv isn't that strong, but wanted to mention it as an alternative to at least consider.

jonmmease avatar Sep 07 '24 11:09 jonmmease

@jonmmease I'm quite keen to use pixi in vl_convert as you know https://github.com/vega/vl-convert/issues/186

Altair development doesn't require any non-Python dependencies, so the advantage of Pixi over uv isn't that strong, but wanted to mention it as an alternative to at least consider.

I would say the advantages you listed are interesting but not sure how they'd benefit altair.

Also from reading the docs, it seems to be missing building & publishing. Not fully sure how that works in altair, but I thought hatch & uv both had these features?

dangotbanned avatar Sep 07 '24 11:09 dangotbanned

Yeah, for building and publishing you still use python -m build and twine. And I have no objection to using uv (or staying with hatch for that matter).

jonmmease avatar Sep 07 '24 12:09 jonmmease

FYI we're already using uv for dependencies.

I'd support switching over fully if there was no loss of functionality.

A first step would be trying to replace pip and hatch in all the GitHub workflows. They now have proper documentation - which was missing when I tried that out before

Related

  • https://github.com/narwhals-dev/narwhals/blob/e9afffd233ed4b4df5364dc8c16ba00e16f86871/.github/workflows/downstream_tests.yml
  • https://github.com/narwhals-dev/narwhals/pull/682#issue-2437742813
  • https://github.com/narwhals-dev/narwhals/issues/955

dangotbanned avatar Sep 10 '24 20:09 dangotbanned

Noting here that switching to uv from pip in build.yml may unblock https://github.com/vega/altair/actions/runs/10861487260/job/30143440337?pr=3591

  • #3591

If I've understood https://github.com/geopandas/pyogrio/issues/450 correctly, installing pyogrio with conda instead of pip avoided the same error

dangotbanned avatar Sep 15 '24 10:09 dangotbanned

I've recently been working with uv some more in https://github.com/vega/vega-datasets:

  • https://github.com/vega/vega-datasets/pull/647
  • https://github.com/vega/vega-datasets/pull/631

Definitely a joy to work with, but there is one gap we'd need to solve before the switch hatch -> uv

  • [ ] https://github.com/astral-sh/uv/issues/5903

All of our hatch scripts would need to be either:

  1. Migrated into Python scripts (for cross-platform support)
  2. Migrated to some other task runner

hatch scripts:

https://github.com/vega/altair/blob/9002472d65875a8486de920e4cc585420efdd276/pyproject.toml#L118-L122

https://github.com/vega/altair/blob/9002472d65875a8486de920e4cc585420efdd276/pyproject.toml#L131-L143

https://github.com/vega/altair/blob/9002472d65875a8486de920e4cc585420efdd276/pyproject.toml#L145-L162

https://github.com/vega/altair/blob/9002472d65875a8486de920e4cc585420efdd276/pyproject.toml#L167-L196
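As a rough illustration of option 1 (migrating hatch scripts into Python scripts), a single helper module could keep the commands centralised in one place. This is only a sketch: the `TASKS` table, the `run_task` function, and the commands listed are hypothetical placeholders, not Altair's actual scripts.

```python
import subprocess
import sys

# Hypothetical task table. Each task is a list of commands, and each command
# is an argument list run without a shell, so it should behave the same on
# Windows, macOS and Linux. The commands are placeholders, not Altair's.
TASKS: dict[str, list[list[str]]] = {
    "lint": [
        [sys.executable, "-m", "ruff", "check", "."],
        [sys.executable, "-m", "ruff", "format", "--check", "."],
    ],
    "test": [[sys.executable, "-m", "pytest"]],
}


def run_task(name: str, tasks: dict[str, list[list[str]]] = TASKS) -> int:
    """Run each command of a task in order, stopping at the first failure.

    Returns 0 on success, otherwise the exit code of the failing command.
    """
    if name not in tasks:
        raise KeyError(f"unknown task: {name!r}")
    for command in tasks[name]:
        completed = subprocess.run(command, check=False)
        if completed.returncode != 0:
            return completed.returncode
    return 0
```

With a small entry point that calls `run_task(sys.argv[1])`, such a module could be invoked as something like `uv run tasks.py test`, where `tasks.py` is an assumed file name. Using argument lists instead of shell strings is what avoids depending on a particular shell.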

It makes it easier for new Altair contributors to get started as they might already know it from other projects @binste

I think having all these scripts/commands fragmented between files - rather than centralised in pyproject.toml - might make it a little more difficult to get started.

Important: this is my only concern. I really want to switch to uv as soon as we can solve this.

dangotbanned avatar Dec 17 '24 12:12 dangotbanned