Loading global state in safe mode attempts to remove files
Similarly to #5122, this happens when running opam-monorepo cram test suite from opam.
It seems that even though we initialize opam in safe mode, simply loading opam's global state with Lock_none attempts to remove some cache files and therefore crashes when sandboxed. I would expect it not to attempt any such operation.
Here's an example of such an error we got when trying to run the test suite:
$ opam-monorepo lock --lockfile test.opam.locked --recurse | grep locally
+ opam-monorepo: internal error, uncaught exception:
+ Cannot remove /home/nathan/.opam/repo/state-392B1E43.cache (/home/nathan/code/opam-monorepo/_opam/.opam-switch/build/opam-monorepo.0.2.7/_build/default/bin/opam_monorepo.exe: "unlink" failed on /home/nathan/.opam/repo/state-392B1E43.cache: Read-only file system).
+ Raised at OpamSystem.internal_error.(fun) in file "duniverse/opam/src/core/opamSystem.ml", line 30, characters 4-30
+ Called from Stdlib__List.iter in file "list.ml", line 110, characters 12-15
+ Called from OpamRepositoryState.Cache.save in file "duniverse/opam/src/state/opamRepositoryState.ml", line 61, characters 4-13
+ Called from OpamRepositoryState.load.(fun) in file "duniverse/opam/src/state/opamRepositoryState.ml", line 234, characters 4-17
+ Called from OpamFilename.with_flock_upgrade in file "duniverse/opam/src/core/opamFilename.ml", line 432, characters 14-45
+ Re-raised at OpamStd.Exn.finalise in file "duniverse/opam/src/core/opamStd.ml", line 1372, characters 4-38
+ Called from OpamRepositoryState.with_ in file "duniverse/opam/src/state/opamRepositoryState.ml", line 284, characters 11-23
+ Called from OpamGlobalState.with_ in file "duniverse/opam/src/state/opamGlobalState.ml", line 186, characters 14-18
+ Re-raised at OpamStd.Exn.finalise in file "duniverse/opam/src/core/opamStd.ml", line 1372, characters 4-38
+ Called from Duniverse_cli__Lock.run in file "cli/lock.ml", line 448, characters 4-172
+ Called from Cmdliner_term.app.(fun) in file "duniverse/cmdliner/src/cmdliner_term.ml", line 24, characters 19-24
+ Called from Cmdliner_term.app.(fun) in file "duniverse/cmdliner/src/cmdliner_term.ml", line 22, characters 12-19
+ Called from Cmdliner_eval.run_parser in file "duniverse/cmdliner/src/cmdliner_eval.ml", line 34, characters 37-44
It should be OK for opam to update the cache files, even when the state itself isn't locked (or is read-only). There are a few fixes related to this, for the library:
- The code should be resistant to sandboxing, so if any of the unlinks fail, opam should (semi-)silently abort writing the caches (i.e. it can log that it was unable to do so, but it shouldn't crash)
- Because the lock on the state may not be acquired, the writing of the cache should really be done atomically with a rename
- In safe mode, opam should not attempt to write the caches at all (from a dev meeting discussion!)
- In safe mode, opam should not attempt to write the caches at all (from a dev meeting discussion!)
That last point might actually force us to move away from safe mode which is not what we generally want.
The reason why we want to use safe mode is because we don't want to accidentally modify one's opam configuration or switches but we do want to use the cache and to some extent I guess we do want to populate it as well. Do you think it would make sense to allow such a policy in the libraries? E.g. distinguish cache write operations from other write operations for instance.
This is a somewhat unrelated problem, we do not rely on the cache in our cram tests as we only compute lockfiles there and do not attempt to pull any package sources. What you suggest would indeed fix our test suite.
The last bit though would prevent pull from efficiently using the cache as it would only benefit from it if you already download the sources via a regular opam operation which is unlikely, especially for dune ports for instance.