Scalable builds
The benchmark I'm adding in #3802 shows the problem pretty clearly:
json_serializable

| libraries | clean/ms | no changes/ms | incremental/ms |
|---|---|---|---|
| 1 | 21573 | 2840 | 4278 |
| 100 | 23040 | 2974 | 6327 |
| 250 | 28098 | 3285 | 12578 |
| 500 | 42940 | 4288 | 35061 |
| 750 | 69483 | 6561 | 67670 |
| 1000 | 115941 | 9178 | 119308 |
notice the incremental build time for 500 libraries and 1000 libraries: it increases from 35s to 119s, an increase of x3.4. For double the number of libraries the time increase should be x2 :)
The benchmark also runs for `built_value`, `freezed` and `mockito` (numbers on the PR); the story is pretty similar, except:
- `built_value` and `json_serializable` are about 2x as slow as `freezed` on the big build; it turns out this is purely because they use shared parts, filed #3803 for that sub-problem
- `mockito` is about 2x as fast as `freezed`; I believe that is because mockito adds a part file to the test, where it does not become a dep of all the other generators; conversely `built_value`, `json_serializable` and `freezed` all add part files in the app code, doubling the number of deps of all the generators (see the sketch below)
Apart from these multipliers the numbers are very similar; I think it does not matter much what the generators are actually doing.
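To make the part-file point concrete, here is a minimal illustrative sketch of the `json_serializable`-style setup, with made-up file and class names (not taken from the benchmark): the `part` directive sits in library code under `lib/`, so the generated `user.g.dart` joins the import graph that every generator run has to resolve; a mockito-style setup instead adds the generated part to a test file, which nothing under `lib/` imports.

```dart
// Illustrative only: a hypothetical lib/user.dart showing where the part
// directive goes in a json_serializable-style package. The file compiles
// once build_runner has generated user.g.dart.
import 'package:json_annotation/json_annotation.dart';

// The generated part lives next to app code, so it becomes a transitive
// dep of everything that imports this library.
part 'user.g.dart';

@JsonSerializable()
class User {
  User(this.name);

  final String name;

  // _$UserFromJson and _$UserToJson are defined in the generated part.
  factory User.fromJson(Map<String, dynamic> json) => _$UserFromJson(json);

  Map<String, dynamic> toJson() => _$UserToJson(this);
}
```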
@davidmorgan I tried cached_build_runner and here are the results cc @tenhobi
`dart run build_runner build --delete-conflicting-outputs`

The result:

`dart run cached_build_runner build`

The result:

dev_dependencies:
  build_runner: ^2.4.14
  cached_build_runner:
    path: /Users/amr/Desktop/cached_build_runner/
@amrgetment Thanks, but caching is not relevant to this issue: it just hides the problem.
@davidmorgan Caching is still part of the solution, but faster builds alone won’t fully solve the problem.
In a real use case, the user might add changes, such as a login feature. However, they typically wouldn’t add multiple features—like reset password, change password, logout, and registration—all at once before running a new build_runner command.
Let's assume faster builds improve performance for 1,000 libraries, reducing the build time from 119 seconds to 30 seconds. With caching, this 30-second build time is further reduced to just 10 seconds.
This results in roughly a 12x improvement overall:
- 4x from faster builds
- 3x from caching
> `mockito` is about 2x as fast as `freezed`, I believe that is because mockito adds a part file to the test where it does not become a dep of all the other generators; conversely `built_value`, `json_serializable` and `freezed` all add part files in the app code, doubling the number of deps of all the generators
Going back to this:
This could be because freezed starts by asking for the AST of the generated file.
I need the AST, and build_runner doesn't give it to me. So I have to ask for it again. And I don't think it's cached, right?
If build_runner could expose the AST to us, that'd be cool. Most of my generators use it.
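For context, a generator typically gets at an AST through build_runner's `Resolver`; below is a minimal sketch (not freezed's actual code; the builder name and output extension are invented) of the kind of per-builder request being described, where each builder asks the resolver for the parsed unit itself rather than receiving one that is already cached for it.

```dart
// Minimal sketch, not freezed's implementation: a builder that asks
// build_runner's Resolver for the AST of its input. Each builder run
// makes this request independently; no parsed AST is handed to it
// from other builders.
import 'package:analyzer/dart/ast/ast.dart';
import 'package:build/build.dart';

class AstPeekingBuilder implements Builder {
  @override
  final buildExtensions = const {
    '.dart': ['.ast_summary.txt'], // hypothetical output extension
  };

  @override
  Future<void> build(BuildStep buildStep) async {
    // Skip part files and other non-library inputs.
    if (!await buildStep.resolver.isLibrary(buildStep.inputId)) return;

    // Ask the resolver to parse the input; the resulting CompilationUnit
    // is only available to this builder run.
    final CompilationUnit unit =
        await buildStep.resolver.compilationUnitFor(buildStep.inputId);

    await buildStep.writeAsString(
      buildStep.inputId.changeExtension('.ast_summary.txt'),
      'top-level declarations: ${unit.declarations.length}',
    );
  }
}
```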
The performance issue I'm looking at is just due to the deps graph, it's nothing to do with what the generator does. When you look at the CPU flame charts of the large benchmarks, the generator barely shows up.
Once that's solved--yes, I think there's a possibility of handling ASTs better/faster, we'd have to compare :) thanks.
> notice the incremental build time for 500 libraries and 1000 libraries: it increases from 35s to 119s, an increase of x3.4. For double the number of libraries the time increase should be x2
This is a computer cache issue. You first run out of L1 cache, then L2, then L3 - each time the costs grow exponentially. So x3.4 is not that bad. :-) (We ran some experiments with @eernstg in https://github.com/dart-lang/language/issues/2727, and also wondered where the perf degradation comes from). In the context of build runner - it might be that the program accumulates something in a map and fails to clean up unnecessary data. (But it might be more complicated than that) FWIW.
> notice the incremental build time for 500 libraries and 1000 libraries: it increases from 35s to 119s, an increase of x3.4. For double the number of libraries the time increase should be x2

> This is a computer cache issue. You first run out of L1 cache, then L2, then L3 - each time the costs grow exponentially.
The build_runner performance issues are unrelated to caching, or to failing to clean up data. The problem is to do with doing unnecessary duplicate work when many generator runs depend on the same large graph of transitive imports. Work in progress, but I think it should not be too long now :)
+1 Just to let you know, this is an issue for me as I have a CI pipeline taking more than an hour to generate. Glad you are working on this!
definite issue for us. Big project. Mono repo. Almost unusable!
The other problem this shows is how slow the baseline performance levels are. 22s for first-run, and 6 seconds for incremental, with a single library, feel at least 10x bigger than they should be if we're going to get anything near the performance level of built-in data classes, or what was imagined w/ macros.
@esDotDev yes, for sure.
You get quite a lot faster small incremental builds if you use watch mode; I'm not sure yet how much people are forced to use build or are not aware of watch, but ideally I'd like repeated use of build to be just as fast as watch. And for both to be faster than today :)
Filed https://github.com/dart-lang/build/issues/4019 for performance improvements for small builds.
Thanks :)
IMO clean build speed is not nearly as important as the other two columns
It's important, don't mind me. But clean build can be multiple orders of magnitude slower than incremental builds.
Take `flutter run`.
The command easily takes tens of seconds to start the app. That's your "clean build".
But once started, hot reload takes a few milliseconds. That's your incremental build.
For that reason, I feel like there's a missed opportunity by not investigating better cache invalidation strategies.
It's cool that we're trying to make writing 1000 files fast. But more often than not, we should just be writing 1 file instead of 1000
> For that reason, I feel like there's a missed opportunity by not investigating better cache invalidation strategies.
There's an arc of work happening over in the analyzer to support fine-grained invalidation there (https://dart-review.googlesource.com/q/Fine), which I expect will help significantly, possibly with some changes on the build_runner side too; we'll see.
I haven't been using `watch` mode for `build_runner` for several years. It used to delete generated files (or at least it looked that way), so my code analysis was constantly going red and non-functional while I was editing a file. Probably worth giving it a try again now.
That should work better now; if not, it's high priority to fix :)
@davidmorgan I filed an issue for a problem I've experienced with watch mode today. Though I don't experience constant breakage in the analyzer anymore, it still happens occasionally: a syntax error in an unrelated source file breaks analysis for source files with `part`s, and it looks like files are not generated and the previous ones are deleted.
If I manage to reproduce it regularly, I will submit another issue.
Hmmm if the generation fails then I think the files do get removed, which is arguably correct. But also arguably it might be more useful, particularly in watch mode, to leave them there. This is something we can look into :) filed https://github.com/dart-lang/build/issues/4027
Excuse me, could you tell where or when this will be available? I couldn't find this information
Do you mean the release of the refactor for performance? Soon, hoping to publish new versions to pub next week.
This specific scalability issue is resolved and released as build_runner 2.5.0, so I'll close the issue.
I have a pile of further performance improvements that I'll be working on next.
Benchmarking with the previous release is a little tricky; it can't actually run the biggest sample projects because of a stack overflow :) For 500 files with json_serializable we have:
- before: 28s initial, 400ms clean, 28s incremental
- after: 13s initial, 2s clean, 9s incremental

And then checking scale-up by 2x to 1000 files with the new version: 16s initial, 3s clean, 16s incremental. And then another 2x to 2000 files: 39s initial, 7s clean, 37s incremental.
This is still slightly worse than linear: there is a scalability issue with the shared parts builder that I'll look into on https://github.com/dart-lang/build/issues/3803.