cabal feature: Private Dependencies

Tracking ticket for private dependencies: #4035

See the commit message of the commit with the same name as the pull request for the long explanation.

TODO checklist:

[x] ~~Fail cabal-check with private dependencies (we need to ensure it all works with hackage before allowing packages with private dependencies to be uploaded~~
[ ] Any changes that could be relevant to users have been recorded in the changelog.
[x] The documentation has been updated, if necessary.
[x] Stress testing private dependencies

Template Α: This PR modifies cabal behaviour

Include the following checklist in your PR:

[x] Patches conform to the coding conventions.
[x] Tests have been added. (Ask for help if you don’t know how to write them! Ask for an exemption if tests are too complex for too little coverage!)

Feb 26 '24 15:02 alt-romes

The testsuite is fully passing on almost all platforms, but fails when using GHC 9.0.2 because of an RTS assertion failure.

Apr 03 '24 15:04 alt-romes

I'm looking into the assertion failure.

Apr 03 '24 16:04 alt-romes

@grayjay I am trying to write a stress test for private dependencies. I would appreciate your advice here, if you have any.

My idea is to generate a long chain of dependencies with some and without any private scopes, and some how compare the two. The problem is I'm not entirely sure how I would compare them. Tree size doesn't sound too meaningful, and counting time is also not fantastic (how would I do that in the solver testsuite?).

Do you have any ideas?

Apr 08 '24 11:04 alt-romes

@alt-romes I have some ideas for testing performance, though I'm not sure how to include it in the existing test suite. It would be too time consuming to run in CI. I think that it is fine to write a performance test that needs to be run manually and then run it once before merging. I also think that a custom performance test for private dependencies is most important before the feature is merged and released. After lower level libraries start using the feature, it will be easier to test with real packages with the solver Hackage benchmark.

I can think of two main cases to test:

A large dependency problem with no backtracking

I think that this is similar to what you described. It would have a lot of dependencies and a lot of different scopes. Ideally, it would also be solvable without private dependencies, so that the two versions of cabal could be compared. I think that the best way to compare would be by time and memory usage. Counting steps is not very useful, because the length of a step can vary significantly.

A dependency problem with a lot of conflicts and backtracking

This test would contain both version conflicts and conflicts in the validation of private dependencies. It would ensure that the solver is able to backtrack effectively and avoid spending a lot of time retrying similar choices related to private dependencies. Unfortunately, I can't think a good way to compare the run time with the old version of cabal.

Another use for this test is checking for space leaks, though it may not add anything over the existing memory usage tests. It may be necessary to turn off some optimizations, such as --fine-grained-conflicts, to force the solver to run for several minutes in order to detect small space leaks.

Apr 09 '24 22:04 sebright

@grayjay Reporting back on the stress testing:

TLDR: stress testing discovered the closure property check was really inefficient[2], and addressing the slowness by improving the algorithm from scratch made the check's performance impact negligible, regardless of whether the feature is used. Stress testing now suggests private dependencies have no significant impact in the solver's performance

I added the two tests as you suggested. The first type of test (A) is easy to set up and I can run it both with and without private dependencies. The second type of test (B) proved to be much harder: I don't know a reasonable way to procedurally generate a network of packages that cause of a lot of backtracking. It seems that (B) is useless.

Let's talk about stress testing private dependencies with (A), because it proved very useful.

The test type (A) takes a configuration parameter X which means: add X dependencies to the main package, and, for every of those X dependencies add X-1 dependencies, and keep recursing with X-1 until you stop adding dependencies.
- For A(6) the test will generate 1000 unique packages, with a max dependency depth of 6.
- For the private variant of the test (A2), every other dependency is introduced in a unique private scope, instead of at the top level.

The results for A(6):

The first results discovered that checking the private closure property holds was incredibly slow.
For the case without any private dependencies at all (A), solving regressed from 11 seconds to 50s.
For the case with private dependencies (A2), solving took over 6 minutes!! A small algorithm tweak brought it down to 2minutes.
In the first case, the closure property check took over 70% of the time, and in the second case it took over 90%!

It was great that stress testing uncovered this. But the results were actually terrible, in regressing both the case with and without the feature.

After a few iterations me and @sheaf redesigned the private-scope-closure-checking algorithm:

The case without any private dependencies is now back to 11s (baseline), and checking the property takes less than 100ms.
The case with many private dependencies and different scopes now also takes 11s (no private deps baseline), with the check being even faster than when there are no private dependencies [1] taking less than 50ms.

[1] because we don't traverse as far up in the reverse dependency graph because there are much more private scopes

[2] despite already being run at every node at the time, not after everything was solved

Apr 24 '24 16:04 alt-romes

@grayjay this branch is in a state ready for your review. Thanks in advance.

Apr 30 '24 07:04 alt-romes

@alt-romes Thanks for testing the case without backtracking. That sounds like a very useful test!

I still think that it is important to test the case with backtracking. I meant to say that even though we can't compare the results to master, we can still get a lot of information. We can examine the -v3 log to see if cabal is trying similar choices over and over again. We can also see if there are common cases where cabal runs for multiple minutes or appears to not terminate. It is too easy for a new feature to cause the run time to become exponential in the number of conflicting packages.

I have an idea for a test, if it is too difficult to procedurally generate dependencies with realistic conflicts. The test would involve temporarily adding code to cabal to modify the source package index that is passed to the solver. The code would randomly (but deterministically) add private dependencies to all versions of x% of the packages. The private dependencies could be chosen from the package's existing dependencies and maybe a small set of low level packages, in order to avoid circular dependencies. Ideally, the new dependencies would be similar to the types of dependencies that authors will add once the feature is released. Then the solver Hackage benchmark could be used to find examples of packages that are difficult to solve for with the modified cabal.

The main downside of this approach is that I can't think of a way to merge it. However, I think that it would be useful to have it on a separate branch (a separate pull request that isn't meant to be merged) so that anyone could check it out and try it before private dependencies is merged.

I forgot to mention that I also think that it would be really useful to test the correctness of the backtracking by adding private dependencies to the solver QuickCheck tests.

Apr 30 '24 08:04 sebright

@alt-romes I had a question about the design before I start reviewing the code. How similar is this PR to the design described in this document? https://github.com/haskell/cabal/issues/4035#issuecomment-1885135923

If I understood the first commit message correctly, this PR already differs by not automatically pulling intermediate dependencies into private scopes.

May 01 '24 07:05 sebright

@grayjay In general, the implementation/final version should be very similar to the original draft design.

You point out that we do not implicitly qualify packages into the private scope: that's right. I suppose in the original draft we would.

May 01 '24 08:05 alt-romes

@grayjay I have attempted to do the backtracking test by changing Cabal to "randomly" qualify certain packages privately instead of at the top-level. In principle, this should always be possible because having scopes with singleton private dependencies should only make more plans valid, not the other way around.

Unfortunately, I wasn't able to get a fully working example of the hackage-benchmarks test using this strategy. I got the modified version of Cabal to finish in roughly the same time as the non-modified version, but the modified Cabal produced "Unknown" final results rather than "OK".

Given time constraints, I haven't pursued this second benchmark any further. I still believe it's important, but given said time constraints and the robust results of the first stress test I'm of the opinion we can proceed with merging this PR before such test is concluded.

I recall that the first stress test, after iteration, showed negligible performance impact on solving for a plan with thousands of dependencies roughly half of which were private. Adding to this the fact that private dependencies are introduced as "shallow" goals (ie not transitive), I'm reasonably confident that the performance impact even with backtracking should be negligible, and that we could merge this as is before having the results of the second test which we can address with time, without worrying about this PR bit rotting in the meanwhile. I can open a follow-up ticket to track stress-testing private dependencies with backtracking further.

If you don't oppose to this omission, I would greatly appreciate your review to move this PR forward diligently.

May 22 '24 16:05 alt-romes

@alt-romes Sorry for the delay. Thank you for working on the backtracking stress test. It sounds promising that the runtimes were comparable to master. I'm not sure I understand what you meant by "unknown" test results. Is that error specific to the benchmark, or was it an error given by the cabal executable under test? Did it occur in all runs, or only when cabal failed to find a solution? Even if the benchmark doesn't complete, are you able to capture any -v3 logs that show how cabal handles the new types of conflicts added by the PR?

I still think that it is important to analyze the performance of private dependencies when there are conflicts, in some way, before merging. The first stress test mainly evaluated the overhead added to each step of dependency solving, but this test mainly evaluates the total number of steps, which is almost orthogonal. It is easy for a change to add almost no overhead to the individual steps but add an exponential number of steps to resolve conflicts, for example, by adding too many variables to the conflict set. It is very difficult to address some types of performance issues once a feature is released and being used by packages on Hackage.

In principle, this should always be possible because having scopes with singleton private dependencies should only make more plans valid, not the other way around.

Is it possible to combine some of the dependencies into the same scope so that they aren't all singletons? I'm concerned that singleton scopes won't exercise all of the conflict types, such as enforcement of the scope closure property. The best test cases would involve many realistic conflicts, of various types, and significant amounts of backtracking, with some cases having solutions and others having no solution due to private dependencies. It's fine for packages to go from succeeding to failing and vice versa, compared to master, since these changes already can't be directly compared to master.

Jun 03 '24 19:06 sebright

@grayjay the details of my attempt are slightly fuzzy, but they were, in the end, inconclusive about the performance of the solver with private dependencies when backtracking is involved.

I was wondering, given your expertise with the solver and recent suggestions, if you could independently attempt to benchmark this PR on your machine with the backtracking relevant machinery. I wasn't able to get the benchmark working properly in my backtracking attempt, and currently cannot afford to spend a lot of time figuring out what was wrong, but perhaps you may be able to get a working solution without trouble since you've worked with these benchmarks in the past.

Jun 12 '24 10:06 alt-romes

@alt-romes I don't have very much time to spend on Cabal currently, but I could try running your benchmark to see if I have any ideas about how to proceed, especially debugging the "unknown" result. Do you have a link to the branch?

Jun 12 '24 16:06 sebright

@alt-romes I don't have very much time to spend on Cabal currently, but I could try running your benchmark to see if I have any ideas about how to proceed, especially debugging the "unknown" result. Do you have a link to the branch?

That would be great! Here's a commit with the change: https://github.com/alt-romes/cabal/tree/wip/romes/private-deps-bench

Jun 13 '24 07:06 alt-romes

@alt-romes I tried running the Hackage benchmark manually on your branch, and I was able to reproduce the "Unknown" error. It indicates that the cabal under test failed but the benchmark didn't recognize the error message. Then I manually ran the modified cabal on one of the packages producing the error. Here is an example error from a lower level package:

$ cabal --ignore-project install --dry-run --lib primitive
Warning: this is a debug build of cabal-install with assertions enabled.
Resolving dependencies...
internal error: could not construct a valid install plan.
The proposed (invalid) plan contained the following problems:
Package primitive-0.9.0.0 has an invalid configuration, in particular:
  the package has a dependency base >=4.9 && <4.21 but no package has been selected to satisfy it.
  the package has a dependency template-haskell >=2.11 but no package has been selected to satisfy it.
  the package has a dependency transformers >=0.5 && <0.7 but no package has been selected to satisfy it.
  the package configuration specifies PRIVATE-base.base-4.18.0.0 but (with the given flag assignment) the package does not actually depend on any version of that package.
  the package configuration specifies PRIVATE-template-haskell.template-haskell-2.20.0.0 but (with the given flag assignment) the package does not actually depend on any version of that package.
  the package configuration specifies PRIVATE-transformers.transformers-0.6.1.0 but (with the given flag assignment) the package does not actually depend on any version of that package.

Proposed plan:
PreExisting rts-1.0.2 (rts-1.0.2)
PreExisting ghc-prim-0.10.0 (ghc-prim-0.10.0)
PreExisting ghc-bignum-1.3 (ghc-bignum-1.3)
PreExisting base-4.18.0.0 (base-4.18.0.0)
PreExisting ghc-boot-th-9.6.2 (ghc-boot-th-9.6.2)
PreExisting transformers-0.6.1.0 (transformers-0.6.1.0)
PreExisting array-0.5.5.0 (array-0.5.5.0)
PreExisting deepseq-1.4.8.1 (deepseq-1.4.8.1)
PreExisting pretty-1.1.3.6 (pretty-1.1.3.6)
PreExisting template-haskell-2.20.0.0 (template-haskell-2.20.0.0)
Configured primitive-0.9.0.0 lib

CallStack (from HasCallStack):
  error, called at src/Distribution/Client/Dependency.hs:903:17 in cabal-install-3.13.0.0-inplace:Distribution.Client.Dependency

It looks like the latest commit modifies qualifiers in the middle of solving, which causes an inconsistency. I think it could be fixed by only modifying packages at the start of solving, such as during index conversion, in convPIs.

Jun 15 '24 07:06 sebright

@grayjay thanks for investigating.

Unfortunately, convPIs doesn't quite work because we don't do any kind of qualification there it seems, so I don't have a way to change the qualifier there and in the adjacent functions.

Jun 19 '24 15:06 alt-romes

We discussed code review at the last meeting Cabal meeting. I'd like to focus on the solver part of this pull request. If I understand correctly, @fgaz is interested in reviewing the changes to the Cabal library. I think that we should also have another reviewer for the high level design and user visible behavior, and others at the meeting suggested @gbaz. @gbaz are you interested in being a reviewer?

We also discussed making this feature experimental, so that Hackage rejects packages with private dependencies, until we've finished testing the performance, especially performance in the presence of conflicts.

Jul 04 '24 08:07 sebright

Unfortunately, convPIs doesn't quite work because we don't do any kind of qualification there it seems, so I don't have a way to change the qualifier there and in the adjacent functions.

I meant that you could directly insert private dependencies during index conversion, as if they were read from the GenericPackageDescription. For example, you could insert them here: https://github.com/haskell/cabal/blob/b7ff4a777a4e615c28c5ded06dc5ee19f7fe0404/cabal-install-solver/src/Distribution/Solver/Modular/IndexConversion.hs#L344-L346

I would expect the solver to add the correct qualifier later.

Jul 04 '24 08:07 sebright