build
AnalyzerResolver calls processAsset O(n^2) times for project with n source files
Although my investigation is based on build_runner 0.6.1+1, which uses code_transformers 0.5.1+4, it seems that all of the heavy code is the same in build_resolvers 0.1.1.

For each asset generated (for example, *.template.dart in an Angular-based project), processAsset is called recursively for every asset in the dependency tree. For example, https://github.com/dart-lang/angular/blob/master/angular/lib/src/core/metadata.dart is processed whenever we generate a template for an @Component. I understand that there is no second read from disk and that dependencies are calculated only once, but processAsset is called so many times while building a large project that this fact alone causes a huge performance loss (up to 30% of build time).
It seems that all heavy code is the same for build_resolvers 0.1.1
Yes, we started with a wholesale copy because we want to add performance optimizations that are not possible with Transformers (and thus couldn't be added to code_transformers). Most of those optimizations will be based on the knowledge that within a build an asset can never change. We also wanted to drop the barback dependency and a lot of wrapping we had to do to interoperate with barback.
For each asset generation (for example, *.template.dart for angular-based project), processAsset will be called recursively for all assets in dependency tree.
Yes, this is accurate, and working as intended (today). Angular needs to depend on all transitive assets today for somewhat complicated reasons, some of which we may be able to avoid in the future, but it isn't clear. As a concrete example, because of the way type inference works, if angular wants to know the type of some field it potentially needs information from all transitive deps. The scenarios where this is actually required are not all that common, but it needs to work. That being said, we can and are looking at things we can safely optimize here.
Longer term we can reduce the number of inputs to any given action by using analyzer summaries instead of Dart sources and by creating larger modules than we do today, at least for non-path dependencies. That has its own drawbacks though (extra build steps to create the summaries, we can't do fine-grained edits to the analysis context so we have to create an entirely new one for each phase, etc.).
@yury-yufimov I am not sure I understand your reasoning about processAsset being invoked O(n^2) times. It uses a visited set to avoid revisiting assets, so it should only be invoked once per asset, meaning it is linear. The only way I see for it to be quadratic is if _performResolve is invoked once for each asset too. Is that what is happening?
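The visited-set argument can be sketched as follows (a minimal Python model, not the actual build_resolvers code; the function, the graph, and the asset names are all hypothetical):

```python
def process_asset(asset, deps, visited, calls):
    """Recursively process an asset and its dependency tree, visiting each node once."""
    if asset in visited:
        return
    visited.add(asset)
    calls.append(asset)  # record each invocation so we can count them
    for dep in deps.get(asset, []):
        process_asset(dep, deps, visited, calls)

# A small diamond-shaped dependency graph: a -> b -> c, and a -> c directly.
deps = {"a": ["b", "c"], "b": ["c"], "c": []}
visited, calls = set(), []
process_asset("a", deps, visited, calls)
assert len(calls) == 3  # one call per asset: linear within a single resolve
```

Within a single resolve this is linear; quadratic behaviour can only appear if a fresh visited set is created for every generated asset, i.e. one full resolve per output.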
Maybe build needs to have a benchmark - e.g. we can autogenerate some big angular app with a lot of dependencies and see how internal build algorithms scale.
@mraleph, I meant the time for the whole-project build, and _performResolve actually is called for every component generation. If componentA and componentB both depend on a library class MyHelper, then this class, and all of its transitive dependencies, will be processed twice.
I do like the idea of an autogenerated big project for tests. I think the performance loss will be nearly quadratic.
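The whole-build concern can be modelled with a toy simulation (hypothetical names, not the real builder): if each of n components triggers its own resolve over the same shared dependency chain, the total number of processAsset invocations grows with n times the chain length, i.e. roughly quadratically.

```python
def resolve(root, deps):
    """One resolve: a DFS with its own fresh visited set; returns number of calls."""
    visited, stack, calls = set(), [root], 0
    while stack:
        asset = stack.pop()
        if asset in visited:
            continue
        visited.add(asset)
        calls += 1
        stack.extend(deps.get(asset, []))
    return calls

def whole_build_calls(n):
    # n components, each importing a shared chain lib0 -> lib1 -> ... -> lib(n-1).
    deps = {f"lib{i}": [f"lib{i+1}"] for i in range(n - 1)}
    deps[f"lib{n-1}"] = []
    for i in range(n):
        deps[f"component{i}"] = ["lib0"]
    # One resolve per generated component, each with its own visited set.
    return sum(resolve(f"component{i}", deps) for i in range(n))

assert whole_build_calls(10) == 10 * 11    # each resolve visits 1 component + 10 libs
assert whole_build_calls(100) == 100 * 101  # total calls grow ~n^2
```

Each individual resolve stays linear; it is the one-resolve-per-output pattern that multiplies the cost.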
The only way I see for it to be quadratic is if _performResolve is invoked once for each asset too. Is that what is happening?
If you run codegen which requires an analysis context on every file, then yes. There are ways to potentially optimize this, but it's going to require a lot of work.
It is important to remember that we are optimizing for incremental edits, not full rebuilds. This requires that we have fine grained knowledge of the exact inputs to each action. The best path to speeding things up here is and likely always will be limiting the total number of inputs, and that is where I would like to focus.
Maybe build needs to have a benchmark - e.g. we can autogenerate some big angular app with a lot of dependencies and see how internal build algorithms scale.
Agreed, we have recently started more performance-related tracking and would like to get some official benchmarks in place. However, auto-generating a large app which actually reflects a real user-created app is extremely difficult. The actual performance depends greatly on how people structure their apps: how large individual packages are, and how fine-grained the imports are (huge packages with a single import are the worst case). It also depends a lot on developer workflows and how deep in the app structure they typically make edits. The deeper the edit, the more actions will be invalidated as a result.
The unfortunate reality is that pretty much all of the guidelines we have given people in the past actually encourage the worst-case scenarios here (specifically, the recommendation to have a single import for your package, with the same name as the package, and to not prefix imports). These recommendations have typically traded build-time performance for developer productivity, and also force us to rely very heavily on whole-app optimizations such as tree shaking. In a world with few build steps, running on the VM for development, this is only an issue for huge apps. However, it is in direct conflict with the modular/incremental build tools we are now producing, such as dartdevc and analyzer summaries, which can't perform those types of optimizations.
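The effect of import granularity on invalidation can be sketched with a toy model (a hypothetical graph, not real build_runner internals): with a single "barrel" import, editing any one library invalidates every consumer; with fine-grained imports, only the direct consumers are rebuilt.

```python
def invalidated(edited, imports):
    """Count app files whose transitive inputs include the edited library."""
    def inputs(lib, seen=None):
        seen = seen if seen is not None else set()
        if lib in seen:
            return seen
        seen.add(lib)
        for dep in imports.get(lib, []):
            inputs(dep, seen)
        return seen
    return sum(1 for consumer in imports
               if consumer.startswith("app") and edited in inputs(consumer))

libs = [f"lib{i}" for i in range(5)]
# Barrel style: every app file imports the single package entry point.
barrel = {"barrel": libs, **{lib: [] for lib in libs},
          **{f"app{i}": ["barrel"] for i in range(5)}}
# Fine-grained style: each app file imports only the one library it uses.
fine = {**{lib: [] for lib in libs},
        **{f"app{i}": [f"lib{i}"] for i in range(5)}}

assert invalidated("lib0", barrel) == 5  # barrel import: all 5 consumers rebuilt
assert invalidated("lib0", fine) == 1    # fine-grained: only app0 rebuilt
```

This is the trade-off described above: the barrel import is more convenient, but it makes every consumer a transitive dependent of every library in the package.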
All that being said, we could generate a few different apps that highlight the worst and best case scenarios, which would allow us to provide better recommendations in the future (and back them up with actual numbers).
I should also add that we've advised a single package and a single import for libraries; we've never really given any advice for developing applications. What I mean is that we could save and re-use caches for dependencies/libraries much longer than we can for your app (which you are actively hacking on).
So an application with a single lib/... import is probably not a great idea.
We've made some improvements here. We are better about when we inform the analyzer about file changes, and we have a new API, Resolver.compilationUnitFor, which does not need to do a transitive import crawl if a builder only needs the AST for a single library.
I also just added an optimization here for dependencies from other targets in the latest build_resolvers. If you have a nice package (or target) structure set up, it is significantly better.
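One way to picture a cross-target optimization like this (purely illustrative, assuming the split into targets; not the actual build_resolvers implementation): assets in dependency targets cannot change within a build, so their resolution can be cached once and shared across all resolves, while only assets in the target being rebuilt are re-processed each time.

```python
from functools import lru_cache

# Hypothetical split: assets in dependency targets are immutable within a build.
DEP_TARGET_ASSETS = {"pkg_a/helper.dart", "pkg_b/util.dart"}

@lru_cache(maxsize=None)
def process_dep_asset(asset):
    """Processed at most once per build, shared across all resolves."""
    return f"resolved:{asset}"

root_calls = []
def process_asset(asset):
    if asset in DEP_TARGET_ASSETS:
        return process_dep_asset(asset)   # cached across resolves
    root_calls.append(asset)              # root-target assets: once per resolve
    return f"resolved:{asset}"

# Two resolves (e.g. two generated components) touching the same dep asset:
for _ in range(2):
    process_asset("app/component.dart")
    process_asset("pkg_a/helper.dart")

assert process_dep_asset.cache_info().misses == 1  # dep asset resolved only once
assert len(root_calls) == 2                        # root asset still done per resolve
```

Under this model, the quadratic blow-up discussed earlier is confined to the (usually small) root target, which matches the observation that a good package/target structure makes a significant difference.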
I don't see us being able to optimize further though.