julia icon indicating copy to clipboard operation
julia copied to clipboard

Extension cycles are problematic (long `precompile` times, difficult to understand/resolve)

Open topolarity opened this issue 1 year ago • 4 comments

The attached test package imports many packages, including ModelingToolkit which currently has a circular dependency among its extensions. In a case with circular deps like this, pkg> precompile seems to be much less successful on 1.10.5 (which back-ported circularity detection) vs. 1.10.4

A locally-built v1.10.5 shows the warning regarding the circularity:

(Circularity_MWE) pkg> precompile
┌ Warning: Circular dependency detected. Precompilation will be skipped for:
│   ModelingToolkitStandardLibrary [16a59e39-deab-5bd0-87e4-056b12336739]
│   NonlinearSolve [8913a72c-1f9b-4ce2-8d82-65094dcecaec]
...
│   BoundaryValueDiffEq [764a87c0-6b3e-53db-9096-fe964310641d]
│   ManifoldsBoundaryValueDiffEqExt [eb713886-0b96-5c41-a09d-5e8967e02f7d]
└ @ Pkg.API ~/repos/julia/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:1279
...
  342 dependencies successfully precompiled in 294 seconds. 31 skipped due to circular dependency.

The using Circularity_MWE message is quite scary and cryptic:

┌ Warning: Module SymbolicsPreallocationToolsExt with build ID ffffffff-ffff-ffff-0000-6f7fcf83bfa3 is missing from the cache.
│ This may mean SymbolicsPreallocationToolsExt [d479e226-fb54-5ebe-a75e-a7af7f39127f] does not support precompilation but is imported by a module that does.
└ @ Base loading.jl:1948
┌ Error: Error during loading of extension SymbolicsPreallocationToolsExt of Symbolics, use `Base.retry_load_extensions()` to retry.
│   exception =
│    1-element ExceptionStack:
│    Declaring __precompile__(false) is not allowed in files that are being precompiled.
│    Stacktrace:
│      [1] _require(pkg::Base.PkgId, env::Nothing)
│        @ Base ./loading.jl:1952
... (many stack frames skipped)

The real problem though, is it takes about 25 minutes for the first using X to complete on my machine:

$ time julia --project=Circularity_MWE -e "using Circularity_MWE"
 2076.34s user 31.37s system 139% cpu 24:53.46 total

On v1.10.4, this takes ~10 seconds.

MWE: Circularity_MWE.tar.gz

Related: #55543

topolarity avatar Aug 21 '24 20:08 topolarity

After discussing this with @KristofferC, I think we're both convinced that these "cycles" need to be forbidden outright. The basic problem is that it's possible for two extensions Ext1 and Ext2 to look at their dependencies and for each to declare "The triggers for ExtX (the other extension) to load are loaded, so therefore ExtX should be available."

The simplest case is that packages A and B both create an extension for each other - ABExt and BAExt both expect the other to be available when pre-compiling in that case. However, that's clearly inconsistent: ABExt cannot expect to be loaded before BAExt if BAExt also expects to be loaded before ABExt.

The answer is that extensions cannot just "look at their dependencies" and expect the corresponding set of extensions to be available - We have to reduce that set somehow

topolarity avatar Aug 22 '24 21:08 topolarity

Kristoffer tried a fix to this in https://github.com/JuliaLang/julia/pull/48674, which made it so that pre-compilation just does not load any extensions at all. However, that ended up reverted as part of https://github.com/JuliaLang/julia/issues/49250 because it means that important functionality (in extensions) can end up unavailable during pre-compilation.

I think it might be sufficient to use a much more specific exception:

An extension being pre-compiled should only load extensions for the individual packages it imports (using Foo).

In particular, using A, B in an extension should only load the extensions that would have been loaded by using A or using B alone.

The idea here is that extensions can't depend on directly triggering other extensions, which prevents cycles, but indirectly triggering them is fine - That way using ModelingToolkit has access to the extensions ModelingToolkit "always" has access to.

There is one corner case... If the extension you're compiling has all of its triggers in its parent's dependencies (i.e. it always loads immediately after its parent, not very extension-like hah) then we need special handling. These "single-trigger" extensions cannot be loaded for other single-trigger extensions with the same parent (they share the same direct trigger → cyclical)

topolarity avatar Aug 22 '24 22:08 topolarity

I've recently noticed that this warning is triggered even for packages with no cycle:

julia> using Revise
┌ Warning: Circular dependency detected. Precompilation will be skipped for:
│   Base.PkgId(Base.UUID("16a59e39-deab-5bd0-87e4-056b12336739"), "ModelingToolkitStandardLibrary")
│   ...
│   Base.PkgId(Base.UUID("eb713886-0b96-5c41-a09d-5e8967e02f7d"), "ManifoldsBoundaryValueDiffEqExt")
└ @ Base.Precompilation precompilation.jl:560
[ Info: Precompiling Revise [295af30f-e4ad-537b-8983-00126c2a3abe] (cache misses: wrong julia version (2))

I never loaded ModelingToolkit... But I pay a huge load time penalty for its cycle.

topolarity avatar Aug 23 '24 21:08 topolarity

I think https://github.com/JuliaLang/julia/pull/53957 should solve that.

KristofferC avatar Aug 24 '24 06:08 KristofferC

Latest proposal is quite a bit more permissive (allows for more ext -> ext dependencies):

Load ext A for ext B if the unique triggers of A are all reachable from the unique triggers of B

This subsumes the rule above, and also would allow for dependencies like:

  • Ext1 depends on {A, C}
  • Ext2 depends on {B, C}
  • B depends on A

In that example, this rule would add an implicit dependency between the extensions:

  • Ext2 depends on Ext1

which was the situation in https://github.com/JuliaLang/julia/pull/55589#issuecomment-2330466773.

The rule also still works even if these are "single-trigger" extensions (e.g., if C depends on B).

topolarity avatar Sep 05 '24 14:09 topolarity

I don't like how a random package adding a dependency (without even loading it) can cause code to behave differently.

KristofferC avatar Sep 05 '24 15:09 KristofferC

What's the problematic case you have in mind?

Whenever a dependency edge B->A affects whether we "infer" an edge from Ext1->Ext2 it's because A and B were actually loaded (since they are the triggers for your extensions).

edit: Ah here's the problem: Take the example above where B depends on A, but let's say B does not actually load A. Ext1 does not trigger in that case, so we wouldn't force it to load either (even though it's ordered by the rule above). If A is loaded later, we already loaded Ext2 without Ext1.

So we prevented the cycle, but didn't provide the right trigger - Yeah not the rule we want.

topolarity avatar Sep 05 '24 20:09 topolarity

Just to summarize the fixes here:

  • Extensions do not have "ambient" rights to load other extensions during pre-compile (removed in #55589)
  • Extensions do load other extensions that their packages had a right to load

At some point in the future, we may want to allow direct ext → ext loading rights. Either the "subset rule" on explicit triggers seems like a sound way to do that (i.e. triggers(ext1) ⊊ triggers(ext2)) or requiring explicit ext -> ext edges works as well.

The important piece for any solution there is that it needs to "break the symmetry" and make explicit whether ExtAB depends on ExtBA, or vice-versa.

topolarity avatar Sep 30 '24 13:09 topolarity