gosling.js
Improve build + packaging; make higlass external?
Motivation
Currently the npm distribution for gosling.js includes:
dist/
├── 1.gosling.js
├── 1.gosling.js.map
├── 2.gosling.js
├── 2.gosling.js.map
├── 3.gosling.js
├── 3.gosling.js.map
├── 4.gosling.js
├── 4.gosling.js.map
├── gosling.js
├── gosling.js.map
└── worker.js
All of which are minified, UMD-like exports. It seems {1,2,3,4}.gosling.js are chunks for dynamic imports that are aliased by __webpack_require__ in dist/gosling.js, and dist/worker.js is a separate build entirely that ends up being referenced in the source code via raw-loader!../../dist/worker.js.
Ideally, an npm package should provide an ES module entrypoint (which is easier to statically analyze and tree-shake) rather than a pre-built, minified UMD bundle. In my experience, webpack isn't a great choice for a library because you can't target ESM as an output and the generated bundles are difficult for other bundlers to load (e.g. __webpack_require__ statements cannot be resolved, so any project using gosling.js as a dependency will likely be unable to load {1,2,3,4}.gosling.js).
I have experimented with a new build for gosling.js on my vacation and wanted to share some ideas.
Approach
- Bundle gosling.js the library (src/index.ts) with esbuild or rollup and target ESM output (as well as a UMD bundle like before for use in script tags).
- Consume the library gosling.js in the editor (rather than importing from src/). Since the editor is really an "app" and not distributed in the npm module, it would make sense to configure this separately from the library. This is similar to what we do in Viv, where src/ corresponds to the library and avivator/ is an app that consumes src/ as a module. This enforces that the editor only uses the public API from gosling.js.
src/ # bundled with rollup/esbuild
editor/ # built with webpack, imports generated bundle
- Make higlass an external dependency. Since the CSS for higlass is already required via a script tag, I think it would make sense to externalize higlass as well. I'm not sure this is something that has been discussed, but since higlass is distributed as a UMD bundle, we could similarly add it as a script tag alongside react, react-dom, and pixi.js for use with our UMD bundle.
Vite might be an option to unify the first two points (which is what we have done in Viv). One thing I've noticed is that @gmod/* modules are very hard to bundle outside of the webpack <4 ecosystem (lots of node builtins with conditional runtime checks). It would be nice to bundle just these difficult-to-bundle dependencies into our ESM module, and then leave everything else external for bundlers.
@manzt thank you so much for these insights!! I wanted to reorganize the bundling part and separate the editor from gosling.js at some point, and this looks to be great timing.
I agree with all your points. Since I don't have much experience with these, I have some follow-up questions.
- With the proposed changes, I guess we can still test the local changes of gosling.js with the editor (e.g., bundling the updated gosling.js before running the editor)?
- We haven't yet deeply discussed whether we want to externalize higlass or not. Having the CSS file required, I think it makes sense to externalize higlass, although I was also thinking of removing the CSS dependencies at some point. Honestly, I was not sure about the rationale of how people determine which packages to externalize and which ones to include (e.g., react, react-dom, pixi.js). Would the size of the resulting bundle be the main reason?
- To make ESM modules for @gmod/*, doing it in a forked repo would be the most ideal way?
With the proposed changes, I guess we can still test the local changes of gosling.js with the editor (e.g., bundling the updated gosling.js before running the editor)?
This is the trickiest part. Something like Vite might offer a way to unify the entire process, but I've unfortunately run into some headaches trying to get that to work (very slow startup/build time). My idea (and experiment) is as follows:
- bundle src/index.ts with esbuild and generate ESM output dist/gosling.mjs. Rollup is also an option but esbuild is sooo much faster, so if we can get that to work I think it would be a huge bonus during development.
- in the editor webpack config, set an alias:
// webpack.config.js
resolve: {
  alias: { 'gosling.js': path.resolve(__dirname, 'dist/gosling.mjs') }
}
// editor/index.js
import { ... } from 'gosling.js'
- during production, we then need to run:
node build.js && webpack --mode production # build js output _once_ along with site
- during development, we can use a tool like concurrently to run both processes in unison:
concurrently 'node build.js --watch' 'webpack-dev-server --mode development'
Any changes to src/ will trigger a rebuild, which will in turn trigger an update via the dev server.
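For reference, a build.js along these lines might look roughly like the sketch below. It is not the actual experiment: it assumes esbuild 0.17 or later (where watch mode goes through `context()`; older releases used a `watch` build option instead), and the `sourcemap` setting and the `run` helper are illustrative choices. The esbuild module is passed in as an argument so the logic can be exercised with a stub.

```javascript
// build.js (sketch): bundle src/index.ts into a single ESM file with esbuild.
// Assumes esbuild >= 0.17. Entry and output paths follow the steps above.

function buildOptions() {
  return {
    entryPoints: ['src/index.ts'],
    bundle: true,
    format: 'esm',
    outfile: 'dist/gosling.mjs',
    sourcemap: true, // illustrative, not required
  };
}

// `esbuild` is injected (hypothetical helper) so this can run against a stub.
async function run(esbuild, argv) {
  const options = buildOptions();
  if (argv.includes('--watch')) {
    // Rebuild whenever a file reachable from the entry point changes.
    const ctx = await esbuild.context(options);
    await ctx.watch();
    return 'watching';
  }
  // One-off build, e.g. before `webpack --mode production`.
  await esbuild.build(options);
  return 'built';
}

// Usage (hypothetical): run(require('esbuild'), process.argv.slice(2));
```

With something like this in place, `node build.js` would produce the bundle once and `node build.js --watch` would keep it fresh for the dev server.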
Honestly, I was not sure about the rationale of how people determine which packages to externalize and which ones to include (e.g., react, react-dom, pixi.js). Would the size of the resulting bundle be the main reason?
Good question. I don't know if there is a "right" answer because it depends on the target use and format:
UMD
When publishing a UMD format (imported via script tag) you must create a bundle with everything except for what you have marked external. This is because the UMD scripts for other libraries (e.g. react, react-dom, pixi.js, and higlass) add globals to the window on load (e.g. React, ReactDOM, PIXI, hglib), and other UMD modules can depend on those globals. The rest of your bundle needs to include the remaining package dependencies because they aren't added as script tags.
Therefore, a UMD script can be used as long as its external modules are also on the page. You could in theory externalize everything as long as you had a script tag for each dependency; however, not every npm package publishes a UMD version, and adding all those script tags would be a nightmare for an end-user.
All that said, the benefit of making a module external is that a website doesn't need to download multiple copies of the same code, and it often avoids conflicting versions of a dependency. My rule of thumb for UMD is to make the largest, stable dependencies external. These are often peerDependencies for a project as well, which we might consider for higlass.
<script crossorigin type="text/javascript" src="https://unpkg.com/react@16/umd/react.development.js"></script>
<script crossorigin type="text/javascript" src="https://unpkg.com/react-dom@16/umd/react-dom.development.js"></script>
<script crossorigin type="text/javascript" src="https://unpkg.com/pixi.js@5/dist/pixi.js"></script>
<script crossorigin type="text/javascript" src="https://unpkg.com/react-bootstrap@<version>/dist/react-bootstrap.js"></script>
<script crossorigin type="text/javascript" src="https://unpkg.com/higlass@<version>/dist/hglib.min.js"></script>
It's just one extra script tag for our examples.
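To make the "externals must already be on the page" point concrete, here is a small sketch. The mapping uses the global names mentioned above; the `missingExternals` helper is hypothetical and only illustrates the dependency between the UMD bundle and the script tags that precede it.

```javascript
// Package name -> global that its UMD build attaches to `window`.
// Global names are the ones listed above (React, ReactDOM, PIXI, hglib).
const EXTERNAL_GLOBALS = {
  react: 'React',
  'react-dom': 'ReactDOM',
  'pixi.js': 'PIXI',
  higlass: 'hglib',
};

// Hypothetical helper: returns the external packages whose global is not
// defined on `root` (pass `window` in a browser).
function missingExternals(root) {
  return Object.keys(EXTERNAL_GLOBALS).filter(
    (pkg) => root[EXTERNAL_GLOBALS[pkg]] === undefined
  );
}

// A page that forgot the higlass script tag:
const page = { React: {}, ReactDOM: {}, PIXI: {} };
console.log(missingExternals(page)); // [ 'higlass' ]
```

The same package-to-global shape is what Rollup's `output.globals` option takes when emitting a UMD bundle, so a map like this could plausibly be shared with the build configuration.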
ESM
For an ESM build, my advice is to make as much as possible "external". The rationale here is that the "target" for this bundle is other applications using gosling.js as a dependency, not direct use in the browser. In my opinion, this target should optimally just be the combined gosling source. This way a bundler can resolve any shared dependencies and drastically reduce the bundle size. Compare with the current situation, where higlass is baked into the gosling bundle:
import * as gosling from 'gosling.js'; // hglib included in gosling bundle
import * as hglib from 'higlass'; // hglib loaded again
import { ... } from 'd3-array';
For example, I know that higlass is already an optional dependency of vitessce, so implementing https://github.com/vitessce/vitessce/issues/955 would currently entail loading higlass twice if a higlass component and gosling component were on the same page.
My final comment (and opinion) is that the goal of the "module" bundle should be to remove as much of the build complexity as possible. Bundling an application using gosling.js, e.g.,
import * as gosling from 'gosling.js';
console.log(gosling);
should require almost no bundler configuration, but get all the benefits of tree-shaking, shared dependency resolution, etc. This means we should compile the TypeScript and remove any non-standard import statements (e.g. raw-loader!). Ideally all gosling dependencies should similarly work with nearly zero configuration, but some are difficult to bundle for the browser, as mentioned. In this case, I would suggest we do others the favor of including these "difficult" modules in our bundle so that they don't need to go through the work of doing so.
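One way to express "everything external except the difficult modules" is to derive the external list from the package manifest. The sketch below is only an illustration: the `externalDependencies` helper and the example manifest are hypothetical, and it assumes the "difficult" packages can be identified by a name prefix such as @gmod/.

```javascript
// Decide which dependencies stay external in the ESM build.
// Anything whose name starts with one of `bundledPrefixes` gets inlined.
function externalDependencies(pkg, bundledPrefixes = ['@gmod/']) {
  const names = [
    ...Object.keys(pkg.dependencies || {}),
    ...Object.keys(pkg.peerDependencies || {}),
  ];
  return [...new Set(names)]
    .filter((name) => !bundledPrefixes.some((p) => name.startsWith(p)))
    .sort();
}

// Hypothetical manifest, for illustration only.
const pkg = {
  dependencies: { 'd3-array': '*', '@gmod/bam': '*', higlass: '*' },
  peerDependencies: { react: '*' },
};
console.log(externalDependencies(pkg)); // [ 'd3-array', 'higlass', 'react' ]
```

Both esbuild and Rollup accept an array of module names through their `external` option, so the result could be fed into either bundler.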
To make ESM modules for @gmod/*, doing it in a forked repo would be the most ideal way?
I would avoid living on a fork. This has been a pain for us with geotiff in Viv. My suggestion above is that we bundle these "difficult" dependencies ourselves. By including those modules in our "module", we effectively include a fork in our published package for others. There are several options here:
- Create custom resolution in our bundler configuration to deal with these modules.
- Use something like patch-package to modify the node_modules during development.
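As a rough idea of what the first option could look like with esbuild, here is a sketch of a resolve plugin that redirects Node built-ins to local browser shims. The plugin name, the shim directory, and the list of built-ins are all assumptions for illustration; which modules the @gmod/* packages actually need shimmed would have to be checked.

```javascript
const path = require('path');

// Sketch of an esbuild plugin: imports of the listed Node built-ins are
// resolved to files in `shimDir` instead (e.g. `fs` -> <shimDir>/fs.js).
// `shimDir` and `builtins` are hypothetical.
function nodeShimPlugin(shimDir, builtins = ['fs', 'path']) {
  // Match the bare module names exactly, so `fs-extra` is left alone.
  const filter = new RegExp(`^(${builtins.join('|')})$`);
  return {
    name: 'node-shims',
    setup(build) {
      build.onResolve({ filter }, (args) => ({
        path: path.resolve(shimDir, `${args.path}.js`),
      }));
    },
  };
}

// Usage (hypothetical): add nodeShimPlugin('shims') to the `plugins` array
// of the esbuild options.
```

Compared with patch-package, this keeps the workaround in the build configuration rather than in modified copies of the dependencies.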
These make a lot of sense to me, and thank you for these thoughtful comments again!
I think, after the deadline for the VIS presentation recording which is due Sep 12, I can start working on this.