Better IDE integration
If use of build_runner is expected to increase with macros going away, it could be good to have better integration with IDEs. Currently in VS Code we have some tasks that make it easier to start build_runner, but they're basically just run as shell commands for convenience and have no real interaction with the editor. Presumably if you have multiple projects that use it, you might end up running many processes too.
It could be nice to have something more like the analysis server: a single process the IDE can spawn that can handle multiple projects in the workspace (workspace in the VS Code sense, not a Pub workspace, since a user could open a folder that contains a mix of workspace packages and standalone pub packages, and a single build_runner process for all of them would probably be better).
I don't use build_runner much so I don't have any current ideas about what else could be better integrated, but maybe this issue would be a good place for others to add some ideas.
(I'll also note that the IDEs are spawning the Dart Tooling Daemon, which knows what the open roots are. I don't know to what extent it might make sense to integrate via that... on one hand it would be nice to not be spawning so many separate processes, but I'm also not sure DTD should turn into some kind of god-service.)
This, to me, is the dream goal. If we could have a streamlined automatic build_runner watch experience plus something like "Don't require `part file.*.dart` for files in the same directory with the same base name" (#60018), then we'd effectively have 99% of the advantages of macros without any of the downsides.
One area I have been exploring is an analyzer plugin, which you would use instead of build_runner (see this branch). I am waiting on some features in the SDK to land to help with some of the more pathological cases here, since the plugin gets invoked on every keystroke, so you really want to be able to cancel pending async work.
But, I do think it has a lot of really great advantages even as it exists today:
- Faster (no dependency tracking of its own, which is where most of the work goes in build_runner for big projects)
- Simpler (no phases or ordering, everything just kind of runs when it can, in the background)
- Can send diagnostics back to the IDE, which is something I really wanted from macros
- Could do quick fixes to add `part 'foo.g.dart';` etc., which should be a good middle ground between not having to write part directives at all and the current state.
- Automatic (runs as a part of analysis server, no extra tool)
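For context, the part-file convention those quick fixes would target looks roughly like this (the file names and the `Foo` class are hypothetical; `.g.dart` is the common codegen suffix):

```dart
// lib/foo.dart -- the hand-written source file.
// A quick fix could insert this directive automatically when a
// generator wants to emit a companion part file:
part 'foo.g.dart';

class Foo {
  final int bar;
  Foo(this.bar);
}

// lib/foo.g.dart -- the generated file would begin with:
//   part of 'foo.dart';
// followed by whatever members the generator emits for Foo.
```

Today the user has to write (and keep in sync) that `part` directive by hand; a quick fix would at least automate it, and #60018 would remove the need for it entirely in the common case.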
yeah, that would be amazing!
I think per keystroke would be a bit of overkill (CPU intensive), but on save would be great. Diagnostics would be fantastic as well. The big downside of a full-on analyzer plugin is: what happens if a generator crashes? Right now, build_runner watch just exits and you relaunch it when the issue is fixed, but if the whole analyzer went down it would be painful to restart it. (My biggest issue with macros.)
A potential huge win for this would be if a developer could choose between analysis builds and build_runner builds. This would make developing the generator a lot easier since we could use build_runner for breakpoints and variable exploration, and analysis server for generators we're not debugging.
Thanks Dan.
Yes, we definitely want better IDE integration for codegen :) so I removed the question mark from the issue name ;)
There's lots of discussion but nothing concrete on this right now, I will share if/when we have something. And, yes, input welcome :)
I think per keystroke would be a bit of overkill (CPU intensive), but on save would be great.
Yes, I agree per keystroke isn't really what you want. However, I don't think plugins today can distinguish the two. That doesn't mean they couldn't in the future, though.
Diagnostics would be fantastic as well. The big downside of a full-on analyzer plugin is: what happens if a generator crashes? Right now, build_runner watch just exits and you relaunch it when the issue is fixed, but if the whole analyzer went down it would be painful to restart it. (My biggest issue with macros.)
The whole analyzer wouldn't go down, but the plugin might, so you would have to restart the analyzer to get the plugin working again. In general plugins are quite painful right now to develop, so lots of kinks to work out there, but I think they are pretty nice to use as a consumer.
A potential huge win for this would be if a developer could choose between analysis builds and build_runner builds. This would make developing the generator a lot easier since we could use build_runner for breakpoints and variable exploration, and analysis server for generators we're not debugging.
Yes, an option is definitely to just let people choose between the plugin and build_runner. For example, Dart web projects have to use build_runner for compilation anyway and would probably just stick with it for codegen too.
How about an analyzer plugin that talks to build_runner?
You might be able to get an analyzer plugin going that talks to the build_daemon - build_runner itself doesn't actually use build_daemon today, though it really should.
The APIs also don't match up well for existing plugins - they expect to be able to ask for diagnostics for a file on demand based on the in-memory contents (in the IDE), whereas build_runner operates only on disk contents.
I think the only point of it would be to surface the diagnostics which would be nice but might not be worth supporting just for that.
One of the critical questions here is this: In an ideal world, would our code generation support (whether that's based on build_runner or not) be able to generate code based on unsaved changes in the IDE (independent of the mechanism that triggers code generation), or would it only be able to generate code based on file content saved to disk?
I suspect that we want to support the former, but I don't work with code generators enough to know for sure.
How about an analyzer plugin that talks to build_runner?
I could make one if you want. I have quite a lot of experience with both generators and plugins. That'd be interesting to look into
One of the critical questions here is this: In an ideal world, would our code generation support (whether that's based on build_runner or not) be able to generate code based on unsaved changes in the IDE (independent of the mechanism that triggers code generation), or would it only be able to generate code based on file content saved to disk?
I suspect that we want to support the former, but I don't work with code generators enough to know for sure.
Given that even hot reload doesn't work on unsaved changes, I think this is out of scope. Especially with error generation it could even get annoying. Also, if code generators have to write files to disk, then there's a reasonable expectation that the output would be based on the file that's on disk as well.
I've got a working POC of an analyzer_plugin that generates code.
Here's a video of an analyzer_plugin that generates a mixin with a comment listing all fields in the object. (It's a dumb generator) The generator is used by the top-right file, and the generated code is in the bottom right.
You can see that as I modify the left file, the generated code is updated directly (even though the change is unsaved).
It's difficult to tell whether this is a better user experience, though, since I only have a small test project. The update is fairly instantaneous in this limited context, at least.
https://github.com/user-attachments/assets/ebb0f5c1-1348-47bd-8879-0f8da28f9dcf
Actually, IDE support is a brilliant solution for build_runner.
In the IDE, instead of using `build_runner build --build-filter`, I will just click "build" and voilà, it works only for the file I changed.
I've got a working POC of an analyzer_plugin that generates code.
Here's a video of an analyzer_plugin that generates a mixin with a comment listing all fields in the object. (It's a dumb generator) The generator is used by the top-right file, and the generated code is in the bottom right.
That's definitely an interesting thing to experiment with :)
Does your experiment use the generators directly or via build_runner?
My guess for how this should work in the end--if we go in this direction--is that it should use build_runner, and that work happens out of process. So rather than pulling the build system into the analyzer--which was a problem with macros--it would just call out to the build system.
It's true that plugins get their own process, but I think it's still not okay to put a build in there: builds are allowed to scale to any size, they shouldn't run in a process where they can block things that are supposed to be interactive.
Assume that build_runner performance issues will be solved :) and then it's useful to pull it in, since you get consistency with other ways of doing builds, and all the generator config for free.
Re: generation on in-memory files or on save, I find it hard to form a strong opinion here because I always save very frequently while editing to trigger dartfmt. Generating too often anyway doesn't make a lot of sense--it will cause churn in the output.
Actually, IDE support is a brilliant solution for build_runner. In the IDE, instead of using `build_runner build --build-filter`, I will just click "build" and voilà, it works only for the file I changed.
Code generators using the analyzer element model can't actually work one file at a time in any meaningful sense.
They always run on top of fully analyzed code, which always (today) requires reading any transitive deps that changed.
That's the fundamental thing that needs addressing re: fine grained invalidation, which is much harder than a UI issue :) ... see #3810 "more granular invalidation".
It's true that plugins get their own process, but I think it's still not okay to put a build in there: builds are allowed to scale to any size, they shouldn't run in a process where they can block things that are supposed to be interactive.
Analyzer plugins do not generally block anything, they run in a separate process and can provide diagnostics whenever they are ready. This is a key part of why I actually think they are a great solution. It is effectively equivalent to build_runner in this regard.
They always run on top of fully analyzed code, which always (today) requires reading any transitive deps that changed.
This is most certainly true, but the tracking of that invalidation (the slow part today) is handled a lot more efficiently in analyzer than in build_runner. And running the builders is also actually typically quite fast, so it isn't actually a large issue.
It's true that plugins get their own process, but I think it's still not okay to put a build in there: builds are allowed to scale to any size, they shouldn't run in a process where they can block things that are supposed to be interactive.
Analyzer plugins do not generally block anything, they run in a separate process and can provide diagnostics whenever they are ready. This is a key part of why I actually think they are a great solution. It is effectively equivalent to build_runner in this regard.
They block other plugins; if you make a build plugin that does build work in-process then you have now made all other plugins worse.
But I don't think that's a problem--it should be possible to farm the work out to another process. It would just be triggering builds and feeding back the output.
Maybe analyzer plugins should not be a separate process from the main analyzer: maybe they should just have a hard limit on execution time, to force that they do any work out of process.
Considering how memory expensive Analyzer is, IMO the fewer processes using Analyzer we have, the better. So I think it could make sense to generate code directly from a plugin (or the main analyzer if possible).
They block other plugins; if you make a build plugin that does build work in-process then you have now made all other plugins worse.
Are you sure? Afaik each one is its own process (which is its own problem, but it does avoid this issue). The next version of plugins is supposed to resolve this though (so they will be all in one process).
But, either way it is pretty trivial to write it in such a way that it yields often to other things wanting to do work.
In the legacy plugin support every plugin ran in its own isolate. This solved a lot of problems, but used too much memory because every isolate had its own copy of the analysis results.
In the experimental plugin support all of the plugins run together in a single isolate. The analysis results are duplicated, but only once for all plugins. This helps with the memory use, but does mean that every plugin can impact the performance of all the other plugins.
The way we use the analyzer in google3 solves all these problems, we should probably do something similar.
The analyzer supports working from summaries on disk, which means it can quickly spin up, do some computation, write the results and exit. And, it also supports "worker mode" for a persistent process for cases where you need to repeatedly do work on the same data. You can dynamically spin up and spin down worker processes based on available resources.
It's very fast and it's easy to tune to the amount of RAM you want to use.
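The "worker mode" idea can be sketched as a persistent loop over line-delimited JSON requests, so warm state (loaded summaries, caches) is reused across requests instead of being rebuilt each time. This is a hypothetical illustration only; the actual worker protocol used in google3/Bazel-style builds is different and more involved.

```dart
import 'dart:convert';

/// Minimal sketch of a persistent worker loop: read one JSON request per
/// line, handle it (reusing any warm in-process state), and emit one JSON
/// response per request. Names and protocol are illustrative.
Future<void> workerLoop(
  Stream<String> requests,
  void Function(String response) respond,
  Future<Map<String, Object?>> Function(Map<String, Object?> request) handle,
) async {
  await for (final line in requests) {
    final request = jsonDecode(line) as Map<String, Object?>;
    respond(jsonEncode(await handle(request)));
  }
}
```

A supervisor can then spin several of these workers up or down depending on available cores and RAM, which is what makes the approach easy to tune.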
Every machine has multiple cores, we need to get away from single process builds. We definitely should not build a new way of doing builds that is designed as a single process.
That sounds fine for the command-line analyzer, but I don't (at least yet) believe that this will work for the analysis server's use case. I don't think we can spin up a new analysis server every time the user wants code completion results, for example. I'm skeptical that it could deserialize the data from disk and compute the results within the required time frame, especially when taking into account the time for the VM to start running the server's code and the initialization process to connect the server to the client.
You and I need to talk.
That sounds fine for the command-line analyzer, but I don't (at least yet) believe that this will work for the analysis server's use case. I don't think we can spin up a new analysis server every time the user wants code completion results, for example. I'm skeptical that it could deserialize the data from disk and compute the results within the required time frame, especially when taking into account the time for the VM to start running the server's code and the initialization process to connect the server to the client.
You and I need to talk.
I meant as a way to run generators--not necessarily moving any existing functionality.
Personally I'd:
- merge analyzer_plugins with the main Analyzer isolate. This should cut gigabytes of memory. Users tend to have 8 GB of RAM used by Analyzer, and another 8 GB for plugins, even if all plugins run in the same isolate. It would also make Analyzer more responsive. Currently, plugins and the main Analyzer isolate boot sequentially: Analyzer starts, then it spawns plugins, which themselves need to start. That makes plugins available a few seconds after Analyzer.
- If we're worried about plugins locking Analyzer, I'd have the IDE tell users "the plugin `foo` is unresponsive". This should be more than enough for users to blame the relevant plugin rather than Analyzer.
- Code generators could be an analyzer plugin. This could make generation faster.
And it has the cool benefit of enabling generators to emit diagnostics and fixes.
I really don't think the "plugins could do an infinite loop" scenario is a concern.
We just need to relay to users that a plugin did a stupid thing.
That's not really different from, say, a lint rule in the SDK taking 2 seconds to complete. Analyzer already knows which rules took longer to complete than others.
I agree with most of that--except, it's okay to have codegen coordinated by an analyzer plugin, but it should not run in an analyzer plugin. There is no build system that would prefer to be forced to run in a single process it does not control :)
If running in an analyzer plugin is optional, I don't think that's an issue.
We would still have a CLI to manually run generation. And we'd likely have options to enable/disable generation through plug-in.
Macros were some sort of analyzer plug-ins to begin with.
If running in an analyzer plugin is optional, I don't think that's an issue.
We would still have a CLI to manually run generation. And we'd likely have options to enable/disable generation through plug-in.
Macros were some sort of analyzer plug-ins to begin with.
One of the problems with macros was exactly this: macros were not fast enough to run "inline" in the analyzer, by which I mean, blocking whatever you do next with the analyzer.
So it's very much the same problem. Whatever is integrated into the analyzer's main loop, has to be fast; builds are not always fast. So it needs to be coordination of the build that's integrated, not the build itself.
I would push back and say I do think running builders inside an analyzer plugin is actually very reasonable. It results in the smallest amount of total work being done.
If you look at the actual work a builder wants to do compared with a typical custom lint, it's actually roughly the same, maybe even less in a lot of cases (they often don't look at the full AST, just the element model).
So, anywhere it's ok to run plugins it should also be ok to run builders, from a performance perspective.
The one caveat to that is that we don't want to thrash the analyzer too much by emitting files in the middle of analysis - but I have been playing around with a way to cancel work that was previously scheduled and that seems to mostly solve the problem (basically, whenever a plugin is told about some changed files this allows it to stop its previous analysis - or code generation - of those same files immediately).
Also, once you have built all the files in your app then the amount of actual on disk churn is low anyways.
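The cancellation mechanism described above could be sketched like this; the names are hypothetical and this is not the actual analyzer plugin API, just an illustration of "newer change notifications cancel the pending run for the same file":

```dart
/// Illustrative cancellation token: cooperative work checks it between
/// async steps and bails out once cancelled.
class CancellationToken {
  bool _cancelled = false;
  bool get isCancelled => _cancelled;
  void cancel() => _cancelled = true;
}

/// Illustrative scheduler: when the plugin is told a file changed, any
/// in-flight generation for that same file is cancelled and its output
/// discarded, and a fresh run is started.
class GeneratorScheduler {
  final Map<String, CancellationToken> _inFlight = {};

  Future<String?> generateFor(
    String path,
    Future<String> Function(CancellationToken token) run,
  ) {
    _inFlight[path]?.cancel();
    final token = CancellationToken();
    _inFlight[path] = token;
    return run(token).then((output) => token.isCancelled ? null : output);
  }
}
```

This is also why on-disk churn stays low: a superseded run never gets far enough to write its stale output.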
merge analyzer_plugins with the main Analyzer isolate. This should cut gigabytes of memory.
If you have ideas for how to do that I'd love to hear about them and what the tradeoffs would be.
Analyzer already knows which rules took longer to complete than others.
We do have the ability to measure the time taken by a lint, but we currently don't measure it in production so we can't use it to report problem lints to the user.
... it's okay to have codegen coordinated by an analyzer plugin ...
I look forward to hearing more about what kind of coordination you have in mind. But I agree with Jake that it isn't necessarily a bad idea to have code generation be running in a plugin.
The one caveat to that is that we don't want to thrash the analyzer too much by emitting files in the middle of analysis - but I have been playing around with a way to cancel work that was previously scheduled and that seems to mostly solve the problem (basically, whenever a plugin is told about some changed files this allows it to stop its previous analysis - or code generation - of those same files immediately).
We had hypothesized a slightly different approach (but haven't yet had time to try it). We thought it might make sense to add a small delay to the code handling watch events. Right now, when a watch event is discovered we immediately start analyzing the potentially affected files. But if we were to wait a short time, resetting the timer every time an event comes in, then we'd presumably do less work overall without a significant additional latency.
Not only would this potentially deal well with builders, it would also presumably improve our behavior around git operations that often change large numbers of files.
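That delay-and-reset idea is a classic debounce; a minimal sketch (durations and names are illustrative, not analysis server code):

```dart
import 'dart:async';

/// Collapses a burst of file-watch events into a single batch: every new
/// event restarts the countdown, and the batch is flushed only once events
/// have been quiet for [delay].
class WatchDebouncer {
  final Duration delay;
  final void Function(Set<String> changedPaths) onQuiet;
  final Set<String> _pending = {};
  Timer? _timer;

  WatchDebouncer(this.delay, this.onQuiet);

  void addEvent(String path) {
    _pending.add(path);
    _timer?.cancel(); // reset the countdown on every incoming event
    _timer = Timer(delay, () {
      final batch = Set<String>.of(_pending);
      _pending.clear();
      onQuiet(batch);
    });
  }
}
```

With something like this in place, a git operation touching hundreds of files would trigger one analysis pass instead of hundreds.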
So, anywhere it's ok to run plugins it should also be ok to run builders, from a performance perspective.
That doesn't scale. It might be fast for some examples, but it's not always going to be fast.
It does not make sense to design a combined build system and interactive tool that simply becomes unusable when you hit a certain code size / complexity. Build systems must scale.
If there are problems making it work as we want, let's fix those, but the end goal needs to be that builds never block interactive actions.
merge analyzer_plugins with the main Analyzer isolate. This should cut gigabytes of memory.
I'd worry that running more code in the main isolate can only worsen responsiveness in the editor. I feel like recently I've seen fewer complaints about memory usage and more of "When I try to save in VS Code, the formatter blocks the save for a long time" (admittedly, I mostly see VS Code issues so that doesn't necessarily mean it's a more common issue). In most cases this is unrelated to the formatter but rather that the format-on-save request is held up by other code tying up the isolate (I think often long analysis caused by large library cycles).
It's unfortunate that the format request is held up this way (it's pretty disruptive in the editor), because either a) it produces no changes in which case the delay was for nothing or b) it produces changes that the editor applies and then triggers the analysis again.
Sometimes I wonder if the solution is more - not fewer - isolates 🙂. The formatting is done on a string (which the formatter then parses) and doesn't require any analysis results/resolution info. Probably it could be done on another isolate and keep Saving* more responsive, but it would also require the messages from the client were processed in an isolate that wasn't tied up by the analysis (I don't know how big that change would be, or how much complexity it would add).
* Some users enable fix-on-save which unfortunately does require a resolved AST and therefore having format on another isolate would likely not change anything for them.