Consider pex instead of pip for locking

Open phaer opened this issue 2 years ago • 15 comments

We are currently using pip install --dry-run --report in fetchPipMetadata to create our Python lock files. While this works better than earlier FOD-based approaches for many use cases, it has (at least) two major issues:

  • cross-platform locking: We can't currently use that approach to e.g. create aarch64-darwin lock files on x86_64-linux, because we can't easily fake platform environment markers, see https://github.com/pypa/pip/issues/11664#issue-1500189934
  • evaluation of extras & environment markers currently happens at lock-time, not at build-time. While this approach was easier to implement because we could do it in Python, it means that
    • the lock file becomes much larger as we need to lock each platform-extra combination ahead of time
    • it can result in incorrect cross-platform locks for legacy sdists which use setup.py. This is because we would need to prepare metadata by executing setup.py on the build platform. If the code in setup.py introspects the interpreter to determine the platform, that could result in wrong metadata being output.
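
To make the lock-size point concrete, here is a rough Python sketch of how the number of lock variants grows when markers and extras are evaluated at lock-time. It assumes the worst case where every subset of extras is locked for every platform; the platform and extra names are hypothetical, and `lock_variants` is an illustrative helper, not part of dream2nix.

```python
from itertools import chain, combinations, product


def extra_sets(extras):
    """Yield every subset of the given extras, including the empty one."""
    return chain.from_iterable(
        combinations(extras, n) for n in range(len(extras) + 1)
    )


def lock_variants(platforms, extras):
    """List every (platform, set-of-extras) pair that would need its own lock.

    Assumption: each subset of extras is locked separately per platform.
    """
    return [
        (platform, frozenset(subset))
        for platform, subset in product(platforms, extra_sets(extras))
    ]


# Hypothetical example values.
platforms = ["x86_64-linux", "aarch64-linux", "x86_64-darwin", "aarch64-darwin"]
extras = ["brotli", "socks"]

variants = lock_variants(platforms, extras)
print(len(variants))  # 4 platforms * 2**2 extra subsets = 16
```

Under that assumption the count is platforms × 2^extras, which is why deferring marker evaluation to build-time would keep a single lock file small.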

https://github.com/nix-community/dream2nix/issues/583#issuecomment-1650443589 recommended https://pex.readthedocs.io as an alternative to pip to tackle the first problem (and support cyclic dependencies, as that's what the linked issue was about).

pex lock files include unevaluated environment markers, same as pip's installation report, e.g. "brotli>=1.0.9; platform_python_implementation == \"CPython\" and extra == \"brotli\"". So we would need to evaluate those in Nix. Prior work for that includes poetry2nix's pep425.nix.
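
As a rough illustration of what evaluating such a marker against an explicit target environment involves, here is a minimal Python sketch. It is not the Nix implementation discussed here, and the helper names `evaluate_marker` and `is_needed` are hypothetical. It leans on the fact that simple markers happen to parse as Python expressions, and it compares values as plain strings only, so version-aware comparisons such as python_version >= "3.10" are out of scope.

```python
import ast


def evaluate_marker(marker: str, env: dict) -> bool:
    """Evaluate a simple environment marker against an explicit environment.

    Simplification: only ==, !=, in, not in, and/or are supported, and all
    values are compared as plain strings (no version-aware ordering).
    """
    tree = ast.parse(marker.strip(), mode="eval")

    def value(node):
        # A bare name is a marker variable; unknown variables become "".
        if isinstance(node, ast.Name):
            return env.get(node.id, "")
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            return node.value
        raise ValueError("unsupported marker operand")

    def visit(node):
        if isinstance(node, ast.BoolOp):
            results = [visit(v) for v in node.values]
            return all(results) if isinstance(node.op, ast.And) else any(results)
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            left, right = value(node.left), value(node.comparators[0])
            op = node.ops[0]
            if isinstance(op, ast.Eq):
                return left == right
            if isinstance(op, ast.NotEq):
                return left != right
            if isinstance(op, ast.In):
                return left in right
            if isinstance(op, ast.NotIn):
                return left not in right
        raise ValueError("unsupported marker expression")

    return visit(tree.body)


def is_needed(requirement: str, env: dict) -> bool:
    """Return True if the requirement applies to the given target environment."""
    _, _, marker = requirement.partition(";")
    return evaluate_marker(marker, env) if marker.strip() else True


requirement = (
    'brotli>=1.0.9; platform_python_implementation == "CPython" '
    'and extra == "brotli"'
)
# The environment is passed in explicitly, so it can describe a foreign target.
target = {"platform_python_implementation": "CPython", "extra": "brotli"}
print(is_needed(requirement, target))
```

Because the environment is a plain dictionary rather than the running interpreter, the same lock entry can be checked for any target platform or extra.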

What do you think, is that something we should try? Would obsolete my work in https://github.com/phaer/untangled_snakes/ but I am happy about that as long as it improves our python support.

@DavHau @chaoflow

phaer avatar Jul 26 '23 10:07 phaer

pex lock files include unevaluated environment markers, same as pip's installation report, e.g. "brotli>=1.0.9; platform_python_implementation == "CPython" and extra == "brotli"". So we would need to evaluate those in Nix.

What is the transform you do at a high level? From lock file to ... venv? sys.path entry (flat directory of installed wheels) or something else? Pex has a lot of tools to work with a lock file, and those tools evaluate env markers to make the products appropriate for the target (whether a local interpreter or a foreign one described by a complete platform JSON generated on the foreign target with pex3 interpreter inspect --markers --tags).

jsirois avatar Jul 26 '23 16:07 jsirois

Would it still work with the max-date setting that's currently supported by the pip locker? That's very handy!

yajo avatar Jul 27 '23 08:07 yajo

What is the transform you do at a high level? From lock file to ... venv? sys.path entry (flat directory of installed wheels) or something else?

I'd say "something else", but a sys.path entry sounds close; we eventually put each Python package in a separate derivation. Those derivations can be built "purely", meaning they get all their info from inputs declared in the repo (lock file, nixpkgs, etc.) but don't have network access during build-time.

With the current approach, we just evaluate the markers during lock-time and write the evaluated/"effective" dependencies into the lock file, see e.g. https://github.com/nix-community/dream2nix/blob/c93ace7b6b4ba54296442efc9ddd8734826d661a/modules/drvs/ansible/lock-x86_64-linux.json#L67 for what that looks like atm. This has the significant drawback that we'd need to lock each extra/platform combination separately, so moving marker evaluation to build-time seems like the right way forward to me.

Poetry2nix re-implements pep425 and pep508 in nix. Another alternative would be to call pex again during build-time (pex shouldn't need network connectivity or so for those commands). The latter has the advantage of probably having a better tested implementation, but the disadvantage of adding pex to our build-closure.

I'll try to write a proof-of-concept for the latter approach next week or so :)

phaer avatar Jul 27 '23 09:07 phaer

Would it still work with the max-date setting that's currently supported by the pip locker? That's very handy!

@DavHau and I recently wondered if there are any real use cases for that, so I'm curious what you are using it for? In any case, that feature works by mitm-proxying pypi, so that should be doable with pex as well if needed.
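
The date-capping idea can be sketched in Python roughly as follows. This is not dream2nix's actual proxy script: `cap_releases` is a hypothetical helper, the sample data is made up, and the `upload_time_iso_8601` field name is assumed to match what PyPI's JSON API reports per file.

```python
from datetime import datetime, timezone


def parse_upload_time(stamp: str) -> datetime:
    # Rewrite a trailing "Z" so older Python versions can parse the timestamp.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def cap_releases(releases: dict, max_date: datetime) -> dict:
    """Keep only files uploaded on or before max_date; drop emptied versions."""
    capped = {}
    for version, files in releases.items():
        kept = [
            f for f in files
            if parse_upload_time(f["upload_time_iso_8601"]) <= max_date
        ]
        if kept:
            capped[version] = kept
    return capped


# Hypothetical index data for an imaginary project.
releases = {
    "1.0.0": [{"filename": "example-1.0.0.tar.gz",
               "upload_time_iso_8601": "2023-01-10T12:00:00Z"}],
    "1.1.0": [{"filename": "example-1.1.0.tar.gz",
               "upload_time_iso_8601": "2023-09-01T08:30:00Z"}],
}
max_date = datetime(2023, 7, 1, tzinfo=timezone.utc)

visible = cap_releases(releases, max_date)
print(sorted(visible))  # ['1.0.0']
```

A resolver that only sees the capped view returns the same answer on every run, whichever locking tool sits behind the proxy.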

phaer avatar Jul 27 '23 09:07 phaer

what you are using it for?

I'm packaging Odoo. It's a mastodon composed of many pieces and tons of dependencies (js, binary, ruby, python...). This beast moves fast in some points and slow in some others. Their approach towards dependencies is "we officially support those supported by current debian stable at the time of launching Odoo". Odoo releases have a lifespan of 3 years.

As you can imagine, the deps diff accumulated between nixpkgs and debian stable in 3 years is enormous. Besides, it's quite easy for a new dependency release to break Odoo in some way. One recent example: https://github.com/odoo/odoo/pull/124351 (all CI is ❌ there).

Our derivations include many more modules. Those modules evolve at a different pace and may introduce new dependencies.

As you can imagine, keeping the balance between having a stable and an updated-enough deployment is no easy task.

So, summarizing, all I use this feature for is to be able to tell my devs: "if you add a new addon that has new dependencies, just run nix run .#refresh and commit those locks". And I like the peace of mind that the date capping gives me. That command will only add new stuff for that new module, but will not update any other dependencies that may break some other portion of the system. Then I can set up a renovate action that upgrades that date and refreshes the lock files, to let CI detect incompatibilities if any.

yajo avatar Jul 27 '23 10:07 yajo

I think the max-date setting is very valuable as of now as it allows us to update individual dependencies. If we instead use a third party tool to manage the lock file, then this tool might already come with the ability to only update a single dependency, so we might not need the max-date feature anymore.

DavHau avatar Jul 29 '23 06:07 DavHau

Hello,

I am giving a python module for dream2nix based on pex lock files and pyproject.nix a try - see #611 for details.

  • pex and pex3 lock create don't seem to work when installed via nix. I've opened an issue at https://github.com/NixOS/nixpkgs/issues/246448
  • https://github.com/pantsbuild/pex/issues/2100 would be very useful for us. Would also be open to try and help implement that.

phaer avatar Jul 31 '23 22:07 phaer

If we instead use a third party tool to manage the lock file, then this tool might already come with the ability to only update a single dependency, so we might not need the max-date feature anymore.

This is in fact true for Pex. You can use pex3 lock update -p "just-this<3" lock.json to update the existing lock.json if possible by keeping everything the same but trying to bump the "just-this" project to the maximum compatible version less than 3.
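
The effect of such a targeted update can be sketched in Python as below. This only illustrates the outcome described above, not how Pex works internally: a real update also has to re-check transitive dependencies, and `update_one`, the naive dotted-integer version parsing, and the sample versions are all hypothetical.

```python
def parse_version(version: str) -> tuple:
    # Naive parsing: dotted integers only, no pre-releases or local versions.
    return tuple(int(part) for part in version.split("."))


def update_one(lock: dict, project: str, candidates: list, upper_bound: str) -> dict:
    """Bump a single project to its newest candidate below upper_bound.

    Every other pin in the lock is left unchanged.
    """
    limit = parse_version(upper_bound)
    allowed = [v for v in candidates if parse_version(v) < limit]
    updated = dict(lock)
    if allowed:
        updated[project] = max(allowed, key=parse_version)
    return updated


# Hypothetical lock contents and available versions.
lock = {"just-this": "2.1.0", "other": "1.4.2"}
candidates = ["2.1.0", "2.9.1", "3.0.0"]

new_lock = update_one(lock, "just-this", candidates, "3")
print(new_lock)  # {'just-this': '2.9.1', 'other': '1.4.2'}
```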

jsirois avatar Aug 03 '23 16:08 jsirois

I've tested pex3. Indeed the lock file is quite cool! I tested it with this command FWIW:

pex3 lock create --transitive --indent 2 --style universal -o lock.json --resolver-version pip-2020-resolver copier pdm
cat lock.json

It seems like plugging that into dream2nix would be a piece of cake.

However, regarding UX, until now, dream2nix provides the interface to refresh the lock file:

nix run -L .#package.config.lock.refresh

This, together with the mitmproxy, gives a reproducible output based on the date. In practice, it means that the target package can be packaged no matter which framework it uses: poetry, pdm, flit, setuptools, or just a raw requirements.txt file. As long as the dependency is properly fed into dream2nix, it will produce the correct lock file.

I'm not talking only about times when you have to install dependencies, but also times where you use dream2nix to maintain a python project itself. In your example, it's using setuptools. But if I were using poetry, it'd work too.

If we rely on pex3 to maintain the lock file by following the strategy you're saying, then dream2nix users would need to manually keep it up to date, instead of the current process of:

  1. Update your dependency.
  2. Re-lock.

Given the use case I have (https://github.com/nix-community/dream2nix/issues/601#issuecomment-1653338668), I really appreciate the current simplicity of that process.

FWIW this is the requirements.txt file I feed into dream2nix. Pretty please don't make me manually maintain a lock file based on those requirements. 🙏🏼 😓

yajo avatar Sep 14 '23 09:09 yajo

Hello & thank you for the detailed description, but could you clarify the following part:

If we rely on pex3 to maintain the lock file by following the strategy you're saying, then dream2nix users would need to manually keep it up to date, instead of the current process of:

Because I am not sure if I follow here: You'd like to have a mode where you update the lock file to the newest packages that satisfy the constraints in your pyproject.toml/requirements.txt/setup.py?

If so, wouldn't it work like that if you just set the snapshot date to null?

Or are you talking about the incremental aspects of https://github.com/nix-community/dream2nix/issues/601#issuecomment-1664271430, where one could re-use the existing lock file? I agree that this would be a nice feature to have and plan to add it, but it's not implemented yet.

phaer avatar Sep 14 '23 10:09 phaer

You'd like to have a mode where you update the lock file to the newest packages that satisfy the constraints in your pyproject.toml/requirements.txt/setup.py?

Yes.

If so, wouldn't it work like that if you just set the snapshot date to null?

Well, the difference is that each time you run it, you'd get different results. Right now, OTOH, each time you run it, you get the same results because of the mitmproxy + pypi snapshot date.

Imagine the problem when one dependency fails; then you patch it; then you re-lock but another dependency slips in and breaks differently. I've been there when using poetry and poetry2nix: poetry has the nasty habit of updating dependencies every time it can. It's really frustrating.

So, the current approach on dream2nix is awesome. That's what I meant! ❤️ The only thing it's lacking is multi-arch locking. So, if possible, please fix just that (maybe with pex) and leave the rest of the UX as it is, because it's great!

It's so great that I've stopped using poetry and gone back to setuptools on a project because poetry now adds no value. It's so great because I could use hatch, setuptools, or flit, or poetry, or whatever... and the workflow would be exactly the same.

yajo avatar Sep 22 '23 09:09 yajo

TIL about pip-tools... would this help? https://pip-tools.readthedocs.io/en/stable/#using-hashes It seems to produce a hashed lock file from any requirements.txt

yajo avatar Feb 23 '24 10:02 yajo

I think we should just focus on stabilizing the pdm module. It fixes most of the major issues we currently have with pip.

DavHau avatar Feb 25 '24 03:02 DavHau

@yajo the pdm module does allow you to update individual dependencies via the pdm cli and it also has multi platform lock files. Maybe you want to give it a shot? I could use someone thoroughly testing it.

DavHau avatar Feb 25 '24 03:02 DavHau

I'll give it a shot then and report. However I can't promise it'll be soon 😅

yajo avatar Feb 27 '24 07:02 yajo

I still didn't get a chance to test the pdm module. Is that the final answer to this issue?

yajo avatar Apr 18 '24 11:04 yajo

No! Sorry for the confusion, I just noticed that this issue was still open while I had closed the PR https://github.com/nix-community/dream2nix/pull/611 a while ago because I lack time/priority on that.

Happy to re-open if you think that's still useful and/or want to work on it?

phaer avatar Apr 18 '24 11:04 phaer

No, don't worry. I think the issue title is a bit misleading because it focuses on the solution rather than the problem.

The problem is that dream2nix can't build multiarch flakes. If that's solved it's OK closing it AFAIK.

yajo avatar Apr 18 '24 12:04 yajo