esbuild
Incremental build / watch mode should probably be taken more seriously
Please don't take this as a complaint.
My understanding is that esbuild is trying to demonstrate that bundling can be much faster than it currently is with existing tools. While the full build time is super important, for development, the incremental build time is also very important and it would be great if we could demonstrate that a full rebuild could take just a few ms even for complex and big projects.
For this to be demonstrated, I believe the watch mode should be taken more seriously rather than saying "hey, we're super flexible and you should use whatever watching library you see fit". Unfortunately things are not that simple. After reading several issues, it seems to me that the main problem is that the Go language doesn't provide a good cross-platform library to watch for file changes. Although I understand this is a hard problem, it's not an impossible one, and well-supported libraries are common in other programming languages. Maybe it would be worth researching watch libraries for either Go or JavaScript a bit more and getting them to work, because they provide some important advantages over the current watch mode.
Not only is a polling-based approach significantly slower on big projects, but, as far as I can remember from the issues I've read on this topic, the current watch mode only detects that something has changed. To implement caching mechanisms properly, it's important to know exactly which files have changed since the last successful build.
Let me provide a concrete example that would demonstrate the importance of proper caching.
In our application, we support a few different themes for different clients. We implement that by providing a separate CSS bundle for each theme. In Webpack, we use the mini-css-extract-plugin to generate those CSS bundles. They share a lot of code, but this isn't an issue since only one theme is supported for a given client. This plugin concatenates all imported CSS in the right order for each entry point and bundles it into a single CSS file per entry point. We have one entry point per theme and that works fine.
I couldn't find a plugin like that for esbuild, so I was thinking about how to implement one and then I realized that it couldn't be implemented in the most efficient way due to how the watch/incremental mode is implemented.
Ideally, if none of the dependencies of the themes have changed, we shouldn't have to regenerate the CSS bundles for the themes. Unless we have the list of removed/modified/new files, the only way to check for modifications is by caching the full content of the files and checking whether their contents have changed. Doing that on every rebuild is far more expensive than checking whether any of the dependencies have changed by comparing their paths to the list of modified/removed/new paths provided by proper watch libraries. Bundling that CSS is not trivial. Besides concatenating all CSS content, we must rewrite the paths in url(path) accordingly and finally generate a new hash for the bundled CSS (when hashes are enabled). This can add enough to the total rebuild time to make incremental mode look worse than competing tools.
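The path-based check described above can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions: `FileChanges` and `themeNeedsRebuild` are hypothetical names invented here, and the change lists are assumed to come from some external file watcher, since esbuild does not provide them.

```typescript
// Hypothetical shape of the change lists an external file watcher would report.
interface FileChanges {
  added: string[];
  modified: string[];
  removed: string[];
}

// Returns true when at least one changed path is a known dependency of the theme.
// This only compares paths; no file contents are read or hashed.
function themeNeedsRebuild(deps: Set<string>, changes: FileChanges): boolean {
  const changed = changes.added.concat(changes.modified, changes.removed);
  return changed.some((path) => deps.has(path));
}

// Usage: the dependency set would be recorded during the previous successful build.
const darkThemeDeps = new Set<string>(["themes/dark.css", "shared/base.css"]);

const unrelated = themeNeedsRebuild(darkThemeDeps, {
  added: [],
  modified: ["src/app.ts"],
  removed: [],
}); // false: the cached CSS bundle can be reused

const related = themeNeedsRebuild(darkThemeDeps, {
  added: [],
  modified: ["shared/base.css"],
  removed: [],
}); // true: the theme bundle has to be regenerated
```

The cost is proportional to the number of changed paths, not to the size of the files, which is the point of the argument above.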
I really like the idea of pushing the build performance forward in the bundling arena, and I think it would be awesome if a proper incremental API would be available on esbuild to allow for better caching strategy.
Please let me know if my examples are confusing. I can certainly elaborate on this topic.
I agree that polling is basically unusable from a performance perspective.
IMO a decent workaround for this that doesn't require the development of a full-blown filesystem watcher in Go would be for esbuild to expose an API for telling it when specific files change, so that we could then wire together an existing watcher on Node's side, like chokidar, and pipe its events to esbuild essentially, so that it can know about granular changes without really needing to implement watching itself.
As a workaround to the workaround, chokidar, rather than passing events to esbuild, could tell it to rebuild the whole thing, but as @rosenfeld highlighted, ideally you'd want to leverage your knowledge about the changed files and be smart about rebuilding, especially if we are talking about something like esbuild, which is our glimmer of hope in this world of slow-ass alternatives.
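To make the event-piping idea a bit more concrete, here is a rough TypeScript sketch of the Node-side half: collecting watcher events between builds and collapsing them into one change set. The event names `add`, `change` and `unlink` follow chokidar's; `ChangeBatch` is a hypothetical helper, and the esbuild API that would accept the resulting change set is exactly the part that does not exist yet.

```typescript
// Event names as emitted by chokidar for files.
type WatchEvent = "add" | "change" | "unlink";
type Status = "added" | "modified" | "removed";

interface ChangeSet {
  added: string[];
  modified: string[];
  removed: string[];
}

// Hypothetical helper: accumulates watcher events between two builds and
// collapses them into one net status per path.
class ChangeBatch {
  private pending = new Map<string, Status>();

  record(event: WatchEvent, path: string): void {
    const prev = this.pending.get(path);
    if (event === "add") {
      // Removed and re-added within one batch: the builder sees a modification.
      this.pending.set(path, prev === "removed" ? "modified" : "added");
    } else if (event === "change") {
      // A file that was added in this batch stays "added" however often it changes.
      this.pending.set(path, prev === "added" ? "added" : "modified");
    } else if (prev === "added") {
      // Added and removed within one batch: nothing happened from the builder's view.
      this.pending.delete(path);
    } else {
      this.pending.set(path, "removed");
    }
  }

  // Returns the net changes since the last drain and resets the batch.
  drain(): ChangeSet {
    const out: ChangeSet = { added: [], modified: [], removed: [] };
    this.pending.forEach((status, path) => {
      out[status].push(path);
    });
    this.pending.clear();
    return out;
  }
}
```

A watcher callback would call `record()` for each event, and the rebuild trigger would call `drain()` and hand the result to whatever consumes it, which today can only be plugin-level caches rather than esbuild itself.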
> As a workaround to the workaround chokidar rather than passing events to esbuild could tell it to rebuild the whole thing
This works really well actually, at least for my really small codebase.
For my particular scenario, I ended up rewriting my CSS to make it simpler, and I currently no longer need any processing other than converting from SASS. The only missing bit in esbuild preventing us from using it in production so far is the lack of working code-splitting logic, which would force me to deliver a bigger bundle if we switched to esbuild in production today. So for now I'm only using esbuild to compile the sources within webpack, and the production build time dropped from 30s to 15s or something like that (a full build with esbuild only finishes in under 2s). I had to rewrite hundreds of imports to use relative paths in order to make it easier to port the build to esbuild, but as a nice side effect, doing that had, surprisingly, already dropped my webpack build from over a minute to about 30s. I'd love to see my build finishing in under 2s soon, so I'm keeping an eye on the code-splitting feature in order to switch as soon as possible.
Anyway, I still see value in making the incremental build as smart as possible, and the only way to properly cache some calculations would be through an API that tells which files have changed. I'm just saying that, from a priority perspective, supporting code splitting would probably come first, as being able to provide faster page loads by delivering less code initially is far more important than allowing the build to finish sooner, in my opinion...
The other pain point I faced when moving from Webpack to esbuild was supporting jquery-ui (yes, it's a very old project we still maintain to this day). jquery-ui uses AMD, which is supported by webpack but not by esbuild, so I basically had to import every single dependency of the jquery-ui components we rely on. It's doable, though, and this was the only project I had issues with when migrating from webpack to esbuild, so I wouldn't say this would be a high-priority feature...
I like the idea of using a Node filewatcher implementation in order to drive the Go API. If this is easy to implement, maybe it makes sense to deliver this before fixing the code splitting feature, which is probably much more tricky to properly implement.
I am currently exploring this for a toolkit I'm building on top of esbuild, and here's how I'm handling it:
- Use `{ write: false, metafile: true }` to get back specific input files and output files from the build command
- After building, figure out what input files are 'incoming' (i.e. new since last build) and 'outgoing' (i.e. no longer need to be watched), and trigger the respective `watcher.add()` and `watcher.unwatch()` events on my own file watcher (e.g. on chokidar)
- If writing build output is desired, loop through all outputFiles and emit them myself
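The 'incoming'/'outgoing' step above might look roughly like this in TypeScript. It relies only on the fact that an esbuild metafile has an `inputs` object keyed by input path; `MetafileLike` and `diffWatchedInputs` are hypothetical names for this sketch, and the chokidar calls are shown as comments so the snippet has no dependencies.

```typescript
// Minimal slice of an esbuild metafile: only the keys of `inputs` are used.
interface MetafileLike {
  inputs: Record<string, unknown>;
}

interface WatchDiff {
  incoming: string[]; // inputs that are new since the last build
  outgoing: string[]; // previously watched inputs that are no longer part of the build
  current: Set<string>; // the full input set, to keep as `watched` for the next build
}

// Compares the currently watched paths with the inputs of the latest build.
function diffWatchedInputs(watched: Set<string>, metafile: MetafileLike): WatchDiff {
  const current = new Set<string>(Object.keys(metafile.inputs));
  const incoming: string[] = [];
  const outgoing: string[] = [];
  current.forEach((path) => {
    if (!watched.has(path)) incoming.push(path);
  });
  watched.forEach((path) => {
    if (!current.has(path)) outgoing.push(path);
  });
  return { incoming, outgoing, current };
}

// After each build (assumed wiring, not run here):
//   const diff = diffWatchedInputs(watched, result.metafile);
//   watcher.add(diff.incoming);
//   watcher.unwatch(diff.outgoing);
//   watched = diff.current;
```

Depending on the setup, metafile paths may need to be resolved against the working directory before they are handed to the watcher, and inputs produced by plugins may need extra handling; both are left out here.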
This way, the only files that get watched are those in your input tree (even if it's deeply nested within plugins and such, such as an import of a JS file within a Svelte file). This works perfectly except that the entire structure of the input is held in virtual memory. This is fine for simple applications, but with larger projects (where you may have to import a 500 MB video, for example) it simply isn't feasible.
To solve this, esbuild could emit the input/output file JSON from the API without requiring { write: false, metafile: true }, so that the same feature can be used with file writing.
We've seen a huge improvement in watch/polling performance in the last month and a half @fabiospampinato. Might be worth checking out again.
@geoffharcourt I believe esbuild is still using polling though? Polling just fundamentally doesn't scale performance-wise, there's no way around that, it just shouldn't be used to begin with.
It does. FWIW, with 3k TS source files we saw significantly less CPU use with esbuild in watch mode (as well as no more runaway processes) after changes this quarter than we did with some solutions we tried with Chokidar and watchexec. (Our implementations may have been somewhat naive.)
I realize if you've got 10k+ source files or something in that range that it might be a bigger resource drain.
There's no way that watching properly is slower than watching by polling.
I think what you are measuring is that, for your project, wasting a huge amount of time polling but rebuilding intelligently is actually better than watching intelligently but wasting a huge amount of time rebuilding everything.
The obvious optimal solution is that both watching and rebuilding should be done properly, which I don't think is currently possible, given that esbuild uses polling and doesn't expose APIs for doing the watching with a proper tool either.
Yes, incremental builds are a hard problem. Check https://jmmv.dev/2020/12/google-no-clean-builds.html for potential solutions.
In the context of this question: would changing the entry point set work for incremental builds? I would like to set up something that starts with a certain set of entry points and then is able to quickly create bundles for subsets of this entry set.
Esbuild apparently has good support for incremental builds (https://esbuild.github.io/api/#incremental), so a more production-grade filesystem watcher can just be put in front of it easily and with good performance.
I too found the current watch mode lacking in several ways:
- It uses polling instead of inotify, and thus is very CPU-intensive and laggy.
- It does not watch the artifacts handled by the `copy` plugin, i.e. manifest.json and *.html files.
- The initial "build finished" log message from context.watch() comes way too early, before it even started to assemble the dist/ directory. This is very confusing. (Logs for subsequent builds were correct.)
In https://github.com/cockpit-project/cockpit-podman/pull/1243 I implement a custom watch mode, which is not that hard -- node's fs.watch() does the inotify bit, and it should be available on all interesting platforms.
> Would changing the entry point set work for incremental builds? I would like to setup something that starts with a certain set of entry points and then is able to quickly create bundles for subsets of this entry set.
I have a use case that requires this right now, currently implemented using an external file watcher. I haven't been able to find a way to add/remove entry points with the current API, so I've been doing full builds every time the entry points change.
This works fine for now since esbuild is ridiculously fast even for full builds, but at some point there will be too many files and the experience will start to break down without incremental builds.
In an ideal world, I'd like to be able to supply an entry points glob for esbuild to watch, and have it add/remove entry points for me automatically, with an efficient watching mechanism and incremental builds. But in the meantime, being able to change the entry point set in the rebuild() call would make incremental builds a lot more usable for me.
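Until something like that exists, one possible stopgap is to keep one incremental build context per distinct entry point set and reuse it whenever the same set comes back. The following TypeScript is only a sketch under that assumption: `ContextPool` is a hypothetical helper, and the factory passed to it is where the caller would create the actual esbuild context; nothing here calls esbuild.

```typescript
// Hypothetical helper: caches one build context per distinct set of entry points.
// `C` is whatever the caller's factory returns, for example a promise of an
// incremental build context created by the caller.
class ContextPool<C> {
  private contexts = new Map<string, C>();
  private create: (entryPoints: string[]) => C;

  constructor(create: (entryPoints: string[]) => C) {
    this.create = create;
  }

  // Returns the cached context for this entry point set, creating it on first use.
  // The key ignores ordering, so ["a", "b"] and ["b", "a"] share one context.
  get(entryPoints: string[]): C {
    const sorted = entryPoints.slice().sort();
    const key = sorted.join("\n");
    if (!this.contexts.has(key)) {
      this.contexts.set(key, this.create(sorted));
    }
    return this.contexts.get(key) as C;
  }

  // Number of distinct entry point sets seen so far.
  size(): number {
    return this.contexts.size;
  }
}
```

This trades memory for speed: every cached context keeps its own incremental state alive, so a real implementation would also need to dispose of contexts for entry sets that are no longer used. It does not help when the entry set changes to one that has never been built before.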