Ability to split pixi.toml into multiple files
Problem description
We have a large and growing pixi.toml file.
It has grown to the point where some kind of hierarchical file-based organization would be helpful.
There are clearly different ways to partition: by function such as task vs dependency, by environment, by project sub-directory, including dependencies and tasks related to a given part of the code-base.
However, at a minimum, just a way to include other files, so as to create a pixi.toml that is a concatenation of all the included files, might help bring a little bit of organization to things.
Interesting. I really appreciate the possibility to keep all my env-related stuff in one file instead of having multiple environment.yml files like I did with conda before. Moreover, the possibility of bundling Pixi into pyproject.toml is also nice for Python-only projects.
Would be interesting to see work on such a large project
Yes, lovely stuff, and could potentially help reduce duplication in matrix use cases, and enable monorepos and skeleton-based tools to get spun up on pixi more quickly.
As long as there is a way to get a single inspectable file back out, fully realized, things are pretty ok. Otherwise, behold the nightmare that is GitHub Actions uses or, worse still, the GitLab pipeline spaghetti: can't know until you try, and have already spun up 20 containers!
concatenation
Yep, TOML is quite resilient to cat. I've explored not actually checking in pixi.toml and having a wrapper script which doesn't need pixi... cleaned up a bit, my "run"-focused pxr:
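#!/usr/bin/env bash
# fail fast and echo each command, including the concatenation step
set -eux
# regenerate pixi.toml from the base manifest and the per-directory partials
cat pixi.base.toml src/*/pixi.partial.toml > pixi.toml
# run each requested task in order
for task in "$@"; do
  pixi run "$task"
done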
Writing those files can be a bit hard to reason about, however, as all those paths need to be pixi.toml-relative.
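One caveat with plain concatenation: TOML does not allow the same table to be defined twice, so two partials that each open a [tasks] header would produce an invalid pixi.toml. A minimal sketch of a guard for the wrapper above follows. The function names find_duplicate_tables and assemble are hypothetical, and the header match is a line-based heuristic, not a TOML parser (it ignores array-of-tables headers and does not catch tables created through dotted keys).

```shell
#!/usr/bin/env bash
# Sketch only: assemble pixi.toml from partials, refusing repeated [table] headers.
set -eu

# find_duplicate_tables FILE...: print each plain [table] header seen more than once.
find_duplicate_tables() {
  # keep lines such as "[tasks]" or "[feature.docs.dependencies]", skip "[[array]]"
  grep -hE '^\[[^][]+\][[:space:]]*(#.*)?$' "$@" \
    | sed -E 's/[[:space:]]*(#.*)?$//' \
    | sort | uniq -d
}

# assemble OUTPUT PARTIAL...: concatenate the partials unless headers collide.
assemble() {
  out="$1"
  shift
  dups="$(find_duplicate_tables "$@" || true)"
  if [ -n "$dups" ]; then
    echo "duplicate TOML tables: $dups" >&2
    return 1
  fi
  cat "$@" > "$out"
}
```

In the wrapper, the cat line could then become something like: assemble pixi.toml pixi.base.toml src/*/pixi.partial.toml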
include
As with the inputs/outputs feature, task could provide some inspiration with its top-level includes, which can be a simple path string or a rich object.
Typographically, this could be adaptable here, assuming a reserved namespace delimiter, e.g. a colon (:):
[project]
includes.docs = "docs/pixi.toml"
includes.atest = { manifest = "atest/pixi.toml", features = ["deps-atest"] }
[project.includes.ff]
manifest = "atest/pixi.toml"
features = ["tasks-atest"]
env = { BROWSER = "firefox" }
[project.includes.cr]
manifest = "atest/pixi.toml"
features = ["tasks-atest"]
env = { BROWSER="chromium" }
[tasks.all]
depends_on = ["docs:build", "ff:robot", "cr:robot"]
inputs = [
"build/docs/.buildinfo",
"build/reports/robot/chromium",
"build/reports/robot/firefox",
]
[environments]
docs = ["docs:deps-docs", "docs:tasks-docs"]
atest = ["atest:deps-atest", "ff:tasks-atest", "cr:tasks-atest"]
Dangerously close to going down the spaghetti route of "post-install hooks": if an include could be declared as coming from an environment, all of a sudden partial pixi workflows could be packaged like proper software:
[project.includes.ex]
manifest = { environment = "default", path = "share/pixi/tasks/example.com/pixi.toml" }
env = { EX_PROJECT_NAME="my-project" }
[feature.baseline.dependencies]
example-com-pixi-tasks = "2024.04.*"
Having been down this road, it gets pretty hairy if a child wants to reference its parent files, features, tasks, etc. This could either be banned (which would limit the overall functionality) or would require some more syntax to make an includable partial contract more explicit and self-documenting, e.g.
[partial]
parent = "root"
# just use JSON Schema: don't get fancy
[partial.env.EX_PROJECT_REPO]
description = "the location of repo"
[partial.env.EX_PROJECT_NAME]
description = "a lovely project name according to your friendly example.com company policies"
pattern = "[a-z][a-z\\d\\-]+"
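To make the idea of an explicit partial contract concrete, here is a rough sketch of how a wrapper might check an env value against such a pattern before handing off to pixi. It assumes the pattern is meant to read "a lowercase letter followed by lowercase letters, digits, or hyphens", and it anchors the match, which JSON Schema does not do by default. The function name check_env_pattern is hypothetical; nothing here is an existing pixi feature.

```shell
#!/usr/bin/env bash
# Sketch only: validate an env value against a partial's declared pattern.
set -eu

# Assumed reading of the pattern above, written as an anchored POSIX ERE.
EX_PROJECT_NAME_PATTERN='^[a-z][a-z0-9-]+$'

# check_env_pattern VALUE PATTERN: succeed only if VALUE matches PATTERN.
check_env_pattern() {
  printf '%s\n' "$1" | grep -Eq -- "$2"
}
```

A caller could then run check_env_pattern "$EX_PROJECT_NAME" "$EX_PROJECT_NAME_PATTERN" and abort with a readable message when the include's contract is not met.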
Looking at Rerun's pixi.toml, this would definitely benefit from a split.
Since the pixi project started we've thought about a "Workspace" feature, which would allow you to have multiple manifest files that can be included in the main manifest file. It takes huge inspiration from https://doc.rust-lang.org/book/ch14-03-cargo-workspaces.html and https://docs.npmjs.com/cli/v7/using-npm/workspaces. Although we don't have a design for this lying around, the question would be whether that fixes the issue for you. If you think it will, I would like to do a full write-up.
Quick summary on the workspace feature:
- Having member packages in subfolders
- Having environments specific to each member, but with shared dependencies.
- Running members' tasks from the root entrypoint.
- Depending on members as actual packages which can be shared and cached during development. (requires pixi build)
I know we're being credited for not splitting our configuration into multiple files which is something I take seriously as well.
With a SAT hammer sitting around, it's nice if everything is a package nail, but needing build a as an extra, hidden depends_on of install:b to be used by run c would feel pretty heavy for something that is another aspect of a project (a la docs, heavyweight tests). Further, it seems that anything touching an env (and presumably the pixi.lock) has the potential to degrade CI cacheability until something can generate bit-for-bit fully-reproducible .conda artifacts (which I have yet to see).
"Although we don't have a design for this laying around the question would be if that fixes the issue for you? If you think it will I would like to do a full write-up."
Yeah, workspaces very much sounds like a direction that would work well for us. This one definitely isn't urgent at all but came up when I was talking to Tim and Bas last week so I opened an issue to keep from forgetting about the idea.
I would personally not be in favor of adding split-manifest functionality without additional features (like workspaces) that can benefit from it, as it would add complexity both to the pixi code base and, when misused, to the projects themselves.