Pkg.jl icon indicating copy to clipboard operation
Pkg.jl copied to clipboard

Ensure `@stdlib` is on `LOAD_PATH` for `.pkg/` hooks

Open staticfloat opened this issue 3 years ago • 12 comments

We need to be able to load the standard libraries such as TOML, Artifacts, etc... when running our .pkg hooks such as .pkg/select_artifacts.jl. While these are usually available, they are not available during tests, so let's ensure that @stdlib is always on the load path.

To trigger the issue, use the following code on Julia v1.8 or later, as only Julia v1.8+ is compatible with an LLVM_jll new enough to use the .pkg hooks.

import Pkg

mktempdir() do dir
    cd(dir)
    Pkg.activate(dir)
    Pkg.generate("SomePkg")
    mkdir("SomePkg/test")
    touch("SomePkg/test/runtests.jl")
    @info("About to add()")
    Pkg.activate("SomePkg/test") do
        Pkg.add("LLVM_jll")
    end
    @info("About to test()")
    Pkg.activate("SomePkg") do
        Pkg.test()
    end
end

staticfloat avatar Aug 09 '22 05:08 staticfloat

Why would they not be available during the tests? The package has those as dependencies and they should therefore be available in the test project?

KristofferC avatar Aug 09 '22 11:08 KristofferC

The issue is that .pkg/select_artifacts.jl is able to be run at a very inconvenient time; it runs during Pkg.instantiate() (and also at other times, but this time is the important one right now). This means that when we run this Julia code, it really shouldn't be allowed to load any non-stdlib packages, because at the time of the hook running, we could be in the middle of instantiating the very package you are trying to load. To work around this (and not impose an ordering within download_source() we decided to just restrict the package hooks to only be able to load stdlibs. As an example of the kinds of things these hooks do, check out MPITrampoline.jl where we use the top-level preference to allow Pkg.instantiate() (and equivalently, Pkg.add()) to download the correct MPI platform in one shot; you can set the preference first, then Pkg.add("PackageThatUsesMPI") and MPITrampoline will download the binary that is built against that backing MPI engine. Also, check out how it doesn't load Preferences, because we're not sure that Preferences is even available yet, so we restrict ourselves to only using Base + stdlibs.

Now, during Pkg.test() time, this is not an issue unless the package is a test-only dependency (and thus would have the same issue of having some dependencies not yet instantiated). But there's entanglement here across two axes:

  • We run the .pkg/select_artifacts.jl hook the same way, whether during Pkg.instantiate() or after.
  • We use gen_build_code(), which is used both for Pkg.build() and these hooks, so some of the options (like setting the active project) don't make that much sense when running the hooks.

The reason I bring up Pkg.test() at all is that Pkg.test() runs with a different LOAD_PATH than Pkg.instantiate() does. Previously, we've only ever tested the artifact hooks at Pkg.add() time, but Valentin noticed that they were broken when running with Pkg.test(). As a reproducer, run the above test script with a master build of Julia, and the following patch to Pkg.jl:

diff --git a/src/Operations.jl b/src/Operations.jl
index 079b7bb81..d256a48de 100644
--- a/src/Operations.jl
+++ b/src/Operations.jl
@@ -926,6 +926,7 @@ end
 function gen_build_code(build_file::String; inherit_project::Bool = false, add_stdlib::Bool = false)
     code = """
         $(Base.load_path_setup_code(false))
+        println(stderr, "Base.LOAD_PATH: ", Base.LOAD_PATH)
         if $(add_stdlib)
             push!(Base.LOAD_PATH, "@stdlib")
         end

It should give the following output:

[ Info: About to add()
<snip>
Base.LOAD_PATH: ["@", "@v#.#", "@stdlib"]
<snip>
[ Info: About to test()
<snip>
Base.LOAD_PATH: ["@", "/tmp/jl_7Xtkd3"]

(Note that I applied this patch on top of this PR, so that the commands actually succeed)

staticfloat avatar Aug 09 '22 17:08 staticfloat

To work around this (and not impose an ordering within download_source() we decided to just restrict the package hooks to only be able to load stdlibs

From what I recall we decided to only allow stuff from Base?

Edit: https://github.com/JuliaLang/Pkg.jl/pull/2024#issuecomment-760534026 I guess I misremembered. Regardless, it is a bit "scary" that this uses packages without declaring a dependency.

fredrikekre avatar Aug 09 '22 21:08 fredrikekre

Regardless, it is a bit "scary" that this uses packages without declaring a dependency.

Alternatively, we could completely change this so that the hooks explicitly run with the environment of Pkg itself? I don't know if that's really feasible, but it would perhaps be more philosophically consistent?

staticfloat avatar Aug 10 '22 00:08 staticfloat

we decided to just restrict the package hooks to only be able to load stdlibs.

That feels kind of arbitrary since the set of stdlibs is changing (https://github.com/JuliaLang/julia/pull/45540). If someone uses DelimitedFiles in their pkg hook, it will be broken after that PR?

KristofferC avatar Aug 10 '22 21:08 KristofferC

Yes, but the expectation is that nobody is going to use anything other than TOML, Artifacts and Libdl, for example. If we wanted to restrict it to just those three, how would we do that?

staticfloat avatar Aug 10 '22 21:08 staticfloat

We could just document that you are only allowed to use those three inside a .pkg/ hook, and then if someone's code breaks because they violate that, it's on them.

DilumAluthge avatar Aug 10 '22 21:08 DilumAluthge

Can we also document explicitly somewhere that people aren't allowed to e.g. mutate package source code inside a .pkg/ hook?

Isn't this already enforced by setting package source to read-only after checkout? I don't think this is specific to .pkg/ hooks, I think this is true of any part of a package, no?

staticfloat avatar Aug 10 '22 21:08 staticfloat

I don't think this is specific to .pkg/ hooks, I think this is true of any part of a package, no?

Ah fair, we do want this to be a rule for all packages. We should still document it somewhere, just not in the place where .pkg/ hooks are located

Isn't this already enforced by setting package source to read-only after checkout? I don't think this is specific to .pkg/ hooks, I think this is true of any part of a package, no?

I think you can still get around this, right? E.g. if you are root (a common situation for people that run their CI inside Docker containers where the USER wasn't set to a non-root user), I'm pretty sure you can still modify the package source, even if the files are set to read-only.

DilumAluthge avatar Aug 10 '22 21:08 DilumAluthge

I think you can still get around this, right? E.g. if you are root (a common situation for people that run their CI inside Docker containers where the USER wasn't set to a non-root user), I'm pretty sure you can still modify the package source, even if the files are set to read-only.

If someone's method of getting around filesystem restrictions is to run Pkg as root, I feel putting a warning in a manual is not going to change their behavior at all. All we can do is enforce rules for 99% of the users, I don't think we need to worry about such exceptional cases as running as root.

staticfloat avatar Aug 10 '22 22:08 staticfloat

I realize now that the whole discussion is moot because we already have this in the docs:

Packages should avoid mutating their own state (writing to files within their package directory).

DilumAluthge avatar Aug 10 '22 23:08 DilumAluthge

We do still need to document somewhere that .pkg/ hooks are only allowed to use TOML, Artifacts, and TOML.

#3168

Also the docs currently say that .pkg/ hooks are "experimental", so users should expect that their hooks might break.

DilumAluthge avatar Aug 10 '22 23:08 DilumAluthge