stork
Publish Stork UI package to NPM
Step 2 from the web instructions shows an embedded script:
<script src="https://files.stork-search.net/stork.js"></script>
- Is this JS published to npm? What if we want to add it as a dependency to our project so it's versioned?
- Do the .wasm file and .js file need to be shipped together? Ideally the wasm file could be published to npm as well.
Hi @styfle! Thanks for writing in.
Stork is not on NPM today. I hope to be able to publish it on NPM at some point, but I've focused on the minimal-installation use case for the 1.0.0 launch. I agree that being able to version the project would be useful for Stork's users.
NPM publishing is made complex by loading the WASM into the Javascript runtime. The Javascript library must reference a specific version of the WASM binary, and today those are published together when a new version is released. Loading Stork from NPM would either require that:
- The JS load the WASM from the Stork CDN, or
- The user host the WASM on their own server.
The first option would require uploading versioned copies of the WASM binary to the Stork CDN, which is possible, just not built into the publishing pipeline today. This would solve versioning, but would not be an appropriate solution for people who want to self-host Stork.
The second option would be equivalent to self-hosting Stork. I've outlined a few of the challenges with that method in the comments of #101, but I haven't ruled out self-hosting entirely.
Ultimately, this is something I want to build, but it will require significant planning and building.
Let me know if you have any more questions!
Best, James
@jameslittle230 Another question for you:
Can I write my own JS to interact with the wasm file? I want to handle my own HTML and CSS, not using the defaults.
WebAssembly.instantiateStreaming(fetch('/stork.wasm'), {}).then(obj => {
  console.log(obj.instance.exports)
})
This seems to fail, so I'm hoping you could provide the code to call into the wasm file (something that's not minified 🙂)
I inspected the stork.wasm file and found that I need to specify the imports object. But both calls to wasm_register_index and wasm_search don't work.
const imports = {
  wbg: {
    __wbg_new_59cb74e423758ede: () => {
    },
    __wbg_stack_558ba5917b466edd: (e, t) => {
    },
    __wbg_error_4bb6c2a97407129a: (e, t) => {
    },
    __wbindgen_object_drop_ref: (e) => {
    }
  }
};
const { instance: { exports } } = await WebAssembly.instantiateStreaming(fetch('/stork.wasm'), imports);
const name = '/index.st'
const res = await fetch(name)
const buf = await res.arrayBuffer()
exports.wasm_register_index(name, new Uint8Array(buf))
const results = exports.wasm_search(name, 'test');
// results is undefined :(
@styfle -
Building one's own JavaScript on top of Stork's WASM API isn't likely going to be a supported way of working with Stork; the two are meant to work together to manage the registration of an index.
The WASM API itself (including all the wbg methods) is built by wasm-pack. In order to work with the WASM API, you'll likely have to build Stork from source. If you run yarn build:wasm:prod, wasm-pack will build an NPM module that includes a WASM binary and an ES6 module in stork.js. That ES6 module includes all the boilerplate necessary to communicate between Javascript and WASM. If you wanted to handle the DOM yourself today, you'd likely need to build the ES6 module from source and effectively replicate the Javascript parts of Stork.
I sympathize that not everyone wants Stork to manage the DOM for them -- this is one of the reasons I've been seeing bugs on the project's website. Instead of making the WASM API easier to work with, though, I plan on exposing more Javascript API methods. Today, stork.register() downloads the WASM blob, downloads the index, and hooks into the DOM. In the future, I want to include four new JS methods on the stork object:
- One that downloads the WASM blob
- One that downloads the index and saves it into the WASM runtime's memory
- One that hooks into DOM elements
- One that lets you search the saved index from Javascript, in case you want to build your own DOM management
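For concreteness, here's a rough sketch of how those four methods might be called from frontend code; the method names below are invented for illustration, not a committed API:

// Purely illustrative -- none of these methods exist yet.
await stork.downloadWasm()                       // 1. fetch the WASM blob
await stork.downloadIndex('docs', '/index.st')   // 2. load an index into the WASM runtime's memory
stork.attach('docs')                             // 3. hook into the DOM
const results = stork.search('docs', 'walrus')   // 4. search from JS, bypassing the DOM entirely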
This project will likely be the next thing I tackle, given the amount of feedback I've been receiving about working with Stork's Javascript library since the project's launch.
Hope that helps, James
Self-hosting support solves a lot of the immediate user need here, but Stork still isn't on NPM.
Unfortunately, hosting a run-in-the-browser, bundle-with-webpack type Javascript package on NPM isn't in my area of expertise, so I'm adding the help-wanted label with the hope that someone else can submit a PR that would let folks npm install stork-search and have it do the right thing.
Let me know if anyone's interested in working on this. I'd give more guidance, but honestly, if you know how to do this, you probably know more about the experience of installing and loading the library than I do.
It looks like you already published to npm here: https://www.npmjs.com/package/stork-search
Although it currently fails during install with the following:
error Package "stork-search" refers to a non-existing file '"./myapp/storkserach/pkg"'.
I get the same error when running yarn install in this repo's root.
What other prep did you do to publish your own website?
The WASM library has to be built first, before running yarn install. Running yarn build should get you what you want, I think.
Edit: Yup, I published this project to NPM, but I'm not sure building the WASM blob from source should be required for somebody to npm install the project. I knowingly published it in an inoperable state mostly just to claim the project name.
@jameslittle230 There's a paradox here.
You can't run yarn build until you have dependencies:
error Command "webpack" not found.
But you can't run yarn install until the wasm pkg is built:
error Package "stork-search" refers to a non-existing file '"./myapp/storkserach/pkg"'.
Yup, that's not ideal.
I think I fixed that last night in this commit: https://github.com/jameslittle230/stork/commit/5d29d9bdba2eee57389885f4bd1759f5bb9baa76#diff-3d64996115d0ac3b74e36f57ce6b00d6b4982614c41b1ed39eab8ef43b2e9a2b
@jameslittle230 That didn't work, still fails with Package "stork-search" refers to a non-existing file.
You can reproduce by doing the following:
git clone https://github.com/jameslittle230/stork
cd stork
./scripts/build.sh
Build scripts that rely on ordering are so tricky to develop on, since building the project slows down the debugging loop. Thanks for being patient.
I think I (really truly) fixed this with ac555b8786b8351f0f27732e3cf127dfae8e7736 -- I'm now able to build the entire project in a clean VM with the above reproduction commands. Let me know if it's working for you now.
That does indeed work now 👍
I think the only thing missing from ./pkg/package.json is "type": "module" to get it to work with Node.js.
What's the difference between ./pkg/stork_bg.wasm and ./dist/stork.wasm? Does the bg represent something?
Coming back to this.
When we talk about putting Stork on NPM, we're actually talking about two different things:
- A package that builds search indexes from the Node runtime. This package would be installed as part of a JS-based static site generator (Next.js, Gatsby, 11ty).
- A package that loads the searching library onto a webpage. The code in this package would be compiled into the rest of a browser JS bundle using e.g. Webpack, and would provide a nice API that you can call from your site's frontend code to download and register a Stork index onto your site.
The first one is more complicated. The Stork index builder is compiled into a target-specific binary. NPM modules are generally platform-agnostic, meaning any computer (Windows, Linux, Mac) can fetch the JS code and Node will run it. To distribute an NPM package that lets you build indexes from within a Node app, that package would need to distribute binaries for different targets and intelligently run the binary based on what computer is calling the package. (I think this is all accurate, but I'm not an NPM distribution expert - please let me know if I'm wrong.)
OR, we could distribute the builder as a WASM blob, which could (potentially) be run from within the Node runtime regardless of platform, in the same way JS code can be run from any platform. The builder has never been compiled to WASM before, and I'm not sure it's even possible, since the builder calls out to a lot of OS code (to read files, make network requests, etc.) - but that would potentially mean less maintenance burden and would hopefully result in fewer people being upset that the Stork Index Building NPM module won't run on their machine.
The second one will likely be easier to build than the first, since the client-side code is already built with JS. The complication, as it has been from the beginning, is the WASM blob: how do we bundle it with client-side code without inlining it (which would massively inflate anyone's bundle) or making laypeople experts in serving WASM files with the appropriate HTTP headers? Now that Stork's client-side code can set a WASM URL, each version of the NPM module could be pinned to a specific WASM version, and the NPM module would still load code from my site.
(For extra credit: a React + Stork component, a Vue + Stork component, etc.)
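To make the second package concrete, here's a rough sketch of how a published browser module might pin its WASM version. The internal module path and the wasmUrl option are assumptions, not Stork's actual API:

// Hypothetical sketch: each published version of the NPM module hard-codes
// the WASM release that matches it, so the pair always stays in sync.
import { register as internalRegister } from './stork-internals.js' // invented module path

const WASM_URL = 'https://files.stork-search.net/releases/v1.4.2/stork.wasm' // illustrative version

export function register(name, indexUrl) {
  return internalRegister(name, indexUrl, { wasmUrl: WASM_URL })
}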
Both are good ideas! Both should exist!
Microlink hosts an npm package for the youtube-dl binaries called youtube-dl-exec. It looks like they use a postinstall node script to download the binary from the official distribution channels:
https://github.com/microlinkhq/youtube-dl-exec/blob/master/scripts/postinstall.js
It gets triggered from package.json, I assume, when the user installs the package on their system.
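For reference, that hook is declared in the package's package.json scripts (the script path here is just an example):

{
  "scripts": {
    "postinstall": "node scripts/postinstall.js"
  }
}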
Then they just wrap the CLI commands into an exportable function via execa. This means you can use it with a JavaScript-friendly, Promise-based interface:
import youtubedl from 'youtube-dl-exec'
const output = await youtubedl('https://www.youtube.com/watch?v=6xKWiCMKKJg', {
  dumpSingleJson: true,
})
Theoretically, you should be able to match the different target-specific binary URLs to Node's process.platform like so:
function getBinaryUrl () {
  // macOS
  if (process.platform === 'darwin') {
    return 'https://files.stork-search.net/releases/v1.4.2/stork-macos-10-15'
  }
  // Linux - default
  return 'https://files.stork-search.net/releases/v1.4.2/stork-amazon-linux'
}
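Pairing that with a postinstall step, the script could download the selected binary and mark it executable. A minimal sketch, assuming an ESM script on Node 18+ (for global fetch) and an invented ./bin/stork output path:

// Fetch the platform binary chosen by getBinaryUrl() above
// and make it executable.
import { writeFile, chmod, mkdir } from 'node:fs/promises'

const res = await fetch(getBinaryUrl())
await mkdir('./bin', { recursive: true })
await writeFile('./bin/stork', Buffer.from(await res.arrayBuffer()))
await chmod('./bin/stork', 0o755)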
We can also pull the latest release assets, such as the stork.wasm and stork-ubuntu-20-04, from GitHub the same way youtube-dl-exec does:
https://api.github.com/repos/jameslittle230/stork/releases?per_page=1
You can use optionalDependencies where each one represents a different binary for a different platform. See Next.js for an example of optionalDependencies:
https://unpkg.com/browse/[email protected]/package.json
And look at one of the deps to see how os and cpu are used:
https://unpkg.com/browse/@next/[email protected]/package.json
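For illustration, that approach for Stork might look something like this; the scoped package names are invented:

{
  "name": "stork-search",
  "optionalDependencies": {
    "@stork-search/darwin-x64": "1.4.2",
    "@stork-search/linux-x64": "1.4.2"
  }
}

Each platform package then declares which systems it supports via the os and cpu fields, so npm only installs the one that matches:

{
  "name": "@stork-search/linux-x64",
  "os": ["linux"],
  "cpu": ["x64"]
}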
(I had responded from an incorrect Github account earlier)
This is great - thanks, both, for outlining a path forward here. I think this will work as expected, and I'll start exploring this as soon as I can.
@jameslittle230 Here's an in-progress download script that's working on macOS and Netlify Ubuntu: https://github.com/ThatGuySam/doesitarm/blob/7318de945d754433add70896777b6a56b007c594/helpers/stork/executable.js
For some reason, it gives me trouble when I try to run the executable to build the index from Node, so I ended up running it via a package.json script instead.
- A package that builds search indexes from the Node runtime. This package would be installed as part of a JS-based static site generator (Next.js, Gatsby, 11ty).
I'd like to add to that: the Node part is optional; any JS runtime should do. (I'm working on a project where it'd be desirable to let users preview the built site from within the headless CMS that runs in the browser. It has access to Web Workers and can obviously load WASM, as browsers do, but can't use Node's environment.)
It seems natural to separate it into runtime-independent code that takes source data blobs (or a blob generator function) and returns the index blob, and a Node CLI that uses fs et al. to read source files and write the resulting index somewhere on disk. Those who want it can use the former without the latter, and do something else with the index blob.
(Addendum: alternatively, it could require passing in fs and other dependencies as arguments, which would allow callers to provide whatever suits their environment. There are precedents: this is what isomorphic-git does, for example.)