Pkg.jl
Pkg.jl copied to clipboard
Ensure `@stdlib` is on `LOAD_PATH` for `.pkg/` hooks
We need to be able to load the standard libraries such as TOML,
Artifacts, etc... when running our .pkg hooks such as
.pkg/select_artifacts.jl. While these are usually available, they are
not available during tests, so let's ensure that @stdlib is always on
the load path.
To trigger the issue, use the following code on Julia v1.8 or later, as
only Julia v1.8+ is compatible with an LLVM_jll new enough to use the
.pkg hooks.
import Pkg
mktempdir() do dir
cd(dir)
Pkg.activate(dir)
Pkg.generate("SomePkg")
mkdir("SomePkg/test")
touch("SomePkg/test/runtests.jl")
@info("About to add()")
Pkg.activate("SomePkg/test") do
Pkg.add("LLVM_jll")
end
@info("About to test()")
Pkg.activate("SomePkg") do
Pkg.test()
end
end
Why would they not be available during the tests? The package has those as dependencies and they should therefore be available in the test project?
The issue is that .pkg/select_artifacts.jl is able to be run at a very inconvenient time; it runs during Pkg.instantiate() (and also at other times, but this time is the important one right now). This means that when we run this Julia code, it really shouldn't be allowed to load any non-stdlib packages, because at the time of the hook running, we could be in the middle of instantiating the very package you are trying to load. To work around this (and not impose an ordering within download_source() we decided to just restrict the package hooks to only be able to load stdlibs. As an example of the kinds of things these hooks do, check out MPITrampoline.jl where we use the top-level preference to allow Pkg.instantiate() (and equivalently, Pkg.add()) to download the correct MPI platform in one shot; you can set the preference first, then Pkg.add("PackageThatUsesMPI") and MPITrampoline will download the binary that is built against that backing MPI engine. Also, check out how it doesn't load Preferences, because we're not sure that Preferences is even available yet, so we restrict ourselves to only using Base + stdlibs.
Now, during Pkg.test() time, this is not an issue unless the package is a test-only dependency (and thus would have the same issue of having some dependencies not yet instantiated). But there's entanglement here across two axes:
- We run the
.pkg/select_artifacts.jlhook the same way, whether duringPkg.instantiate()or after. - We use
gen_build_code(), which is used both forPkg.build()and these hooks, so some of the options (like setting the active project) don't make that much sense when running the hooks.
The reason I bring up Pkg.test() at all is that Pkg.test() runs with a different LOAD_PATH than Pkg.instantiate() does. Previously, we've only ever tested the artifact hooks at Pkg.add() time, but Valentin noticed that they were broken when running with Pkg.test(). As a reproducer, run the above test script with a master build of Julia, and the following patch to Pkg.jl:
diff --git a/src/Operations.jl b/src/Operations.jl
index 079b7bb81..d256a48de 100644
--- a/src/Operations.jl
+++ b/src/Operations.jl
@@ -926,6 +926,7 @@ end
function gen_build_code(build_file::String; inherit_project::Bool = false, add_stdlib::Bool = false)
code = """
$(Base.load_path_setup_code(false))
+ println(stderr, "Base.LOAD_PATH: ", Base.LOAD_PATH)
if $(add_stdlib)
push!(Base.LOAD_PATH, "@stdlib")
end
It should give the following output:
[ Info: About to add()
<snip>
Base.LOAD_PATH: ["@", "@v#.#", "@stdlib"]
<snip>
[ Info: About to test()
<snip>
Base.LOAD_PATH: ["@", "/tmp/jl_7Xtkd3"]
(Note that I applied this patch on top of this PR, so that the commands actually succeed)
To work around this (and not impose an ordering within download_source() we decided to just restrict the package hooks to only be able to load stdlibs
From what I recall we decided to only allow stuff from Base?
Edit: https://github.com/JuliaLang/Pkg.jl/pull/2024#issuecomment-760534026 I guess I misremembered. Regardless, it is a bit "scary" that this uses packages without declaring a dependency.
Regardless, it is a bit "scary" that this uses packages without declaring a dependency.
Alternatively, we could completely change this so that the hooks explicitly run with the environment of Pkg itself? I don't know if that's really feasible, but it would perhaps be more philosophically consistent?
we decided to just restrict the package hooks to only be able to load stdlibs.
That feels kind of arbitrary since the set of stdlibs is changing (https://github.com/JuliaLang/julia/pull/45540). If someone uses DelimitedFiles in their pkg hook, it will be broken after that PR?
Yes, but the expectation is that nobody is going to use anything other than TOML, Artifacts and Libdl, for example. If we wanted to restrict it to just those three, how would we do that?
We could just document that you are only allowed to use those three inside a .pkg/ hook, and then if someone's code breaks because they violate that, it's on them.
Can we also document explicitly somewhere that people aren't allowed to e.g. mutate package source code inside a .pkg/ hook?
Isn't this already enforced by setting package source to read-only after checkout? I don't think this is specific to .pkg/ hooks, I think this is true of any part of a package, no?
I don't think this is specific to
.pkg/hooks, I think this is true of any part of a package, no?
Ah fair, we do want this to be a rule for all packages. We should still document it somewhere, just not in the place where .pkg/ hooks are located
Isn't this already enforced by setting package source to read-only after checkout? I don't think this is specific to
.pkg/hooks, I think this is true of any part of a package, no?
I think you can still get around this, right? E.g. if you are root (a common situation for people that run their CI inside Docker containers where the USER wasn't set to a non-root user), I'm pretty sure you can still modify the package source, even if the files are set to read-only.
I think you can still get around this, right? E.g. if you are root (a common situation for people that run their CI inside Docker containers where the USER wasn't set to a non-root user), I'm pretty sure you can still modify the package source, even if the files are set to read-only.
If someone's method of getting around filesystem restrictions is to run Pkg as root, I feel putting a warning in a manual is not going to change their behavior at all. All we can do is enforce rules for 99% of the users, I don't think we need to worry about such exceptional cases as running as root.
I realize now that the whole discussion is moot because we already have this in the docs:
Packages should avoid mutating their own state (writing to files within their package directory).
We do still need to document somewhere that .pkg/ hooks are only allowed to use TOML, Artifacts, and TOML.
#3168
Also the docs currently say that .pkg/ hooks are "experimental", so users should expect that their hooks might break.