Handling of url data in various display clients
Any spec that uses a URL data source is currently not working in a) the default browser display, b) ElectronDisplay.jl and c) the plot pane in VS Code. Things work fine when such a spec is saved to disc with the save function.
The reason this doesn't work in the three mentioned clients is that all of them use vega.min.js (a local copy of it that ships with the client) to load vega, but that version of vega doesn't include the loader code that can get stuff from disc. The save function on the other hand installs a copy of vega via npm, and that version includes the local disc loading code.
Generally speaking I would much prefer to use vega.min.js everywhere as the deployment vehicle for vega because it means we can just ship a copy inside the julia package, and don't have to run npm on client systems to install vega, which can always go wrong. Also, for ElectronDisplay.jl and VS Code we need to get multiple vega and vega-lite versions onto the client systems, because these displays need to support all the existing vega and vega-lite MIME types.
My understanding is that the npm version of the vega-loader package uses index.js (which includes support for loading from disc), whereas the rollup version of vega-loader that ships as part of vega.min.js uses index.browser.js (which does not support loading from disc).
Ideally (I think) we would somehow be able to get our hands on a vega-with-disc-loading.min.js that is essentially equivalent to the existing vega.min.js, but supports loading from disc. That way we could keep our current simple and robust deployment story, but still support loading from disc. I see two options for this: 1) maybe we can convince the vega folks to just build such a file as part of the normal release process, or 2) we figure out how to create such a file ourselves and then ship that in our packages. I would much prefer 1) for obvious reasons :) One of those reasons is that I'm entirely unfamiliar with the whole JavaScript rollup ecosystem, and so doing this ourselves looks daunting to me.
One caveat with this approach is that I'm not sure whether index.js would actually work in all the clients we have, because it uses require. I think it should probably work in ElectronDisplay.jl and VS Code because both of these are based on Electron, which I believe uses Node under the hood, and so I assume require is available there (but I'm not actually super sure about that). For the default browser display in VegaLite.jl this might not work: in that client we just write an HTML file to disc that displays a spec and then open that in a browser, which I assume doesn't support require. Maybe there is some way to work around that with RequireJS? Again, I'm out of my comfort zone with all this JavaScript stuff :)
I'm CCing @jheer, @domoritz and @mcmcgrath13 in case they have thoughts, comments or other ideas how we might be able to solve this.
The bundles we build are for browser targets and therefore do not include the code to load from disk. In a node environment (which right now is almost every non-browser environment), there is no need for minification and you can just import the vega package. I don't think it's necessary to build a bundle that works in both browsers and node.
@davidanthoff would you be opposed to trying to replace that functionality on the julia side? We could read in the files and pass them to the values field instead of the url field. It's a little bit heavy-handed, but it is guaranteed to work in all environments, as julia is a backend language and always has access to the file system. Loading from disc will likely remain tricky for the browser target.
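To make the url-to-values idea concrete, here is a minimal sketch of the transformation. It is written in JavaScript only for illustration (the suggestion is to do this on the julia side), and the helper name `inlineUrlData` is made up. It assumes a Vega-Lite spec with a single top-level `data.url` that points at a local file, and relies on Vega-Lite accepting a string in `values` that is parsed according to `format.type`.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: replace a local-file `data.url` with inline `data.values`.
// `readText` is injectable so the file access can be swapped out or tested.
function inlineUrlData(spec, readText = (p) => fs.readFileSync(p, 'utf8')) {
  const data = spec.data;
  // Leave specs without a URL source, and http(s) URLs, untouched.
  if (!data || typeof data.url !== 'string' || /^https?:\/\//i.test(data.url)) {
    return spec;
  }
  const { url, ...rest } = data;
  // With a URL the format can be inferred from the extension; with inline
  // string values it has to be stated explicitly, so carry it over.
  const ext = path.extname(url).slice(1).toLowerCase();
  const type = (rest.format && rest.format.type) || ext || 'json';
  return {
    ...spec,
    data: { ...rest, values: readText(url), format: { ...rest.format, type } }
  };
}
```

This only looks at the top-level `data` property; layered or concatenated specs with nested data sources would need a recursive walk.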
The best approach (that has proven to work really well in other environments) is to separate the spec from the data and send the data separately (as a binary arrow table for example). Then in the browser, you can use the Vega view API to inject the data. This way you don't have to serialize the data as JSON/CSV and also handle updates of the data independent from changes to the spec.
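A minimal sketch of that pattern, assuming the spec declares a named, initially empty data source and that `view` is a Vega View object (for example the `view` property of the result returned by vega-embed's `embed`). The dataset name `table` and the helper `injectData` are made up for illustration, and decoding an Arrow table into row objects is left out: `rows` is assumed to already be an array of plain objects.

```javascript
// Vega-Lite spec that carries no data itself, only a named placeholder.
const spec = {
  data: { name: 'table' },
  mark: 'point',
  encoding: {
    x: { field: 'x', type: 'quantitative' },
    y: { field: 'y', type: 'quantitative' }
  }
};

// Hypothetical helper: replace the contents of a named dataset on an
// existing view and re-run the dataflow so the chart updates.
async function injectData(view, datasetName, rows) {
  view.data(datasetName, rows); // View API: set the dataset's values
  await view.runAsync();        // View API: re-evaluate the dataflow
  return view;
}
```

Because the data travels separately from the spec, the same call can be repeated whenever the data changes without re-sending or re-parsing the spec.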
@domoritz could you point me to an example/package/project I could reference?
Not yet but maybe in a month. Do you have specific questions I can answer now?
I think there are two issues here that are quite independent:
- if we have the data in the julia process already, we could be more efficient by not embedding it in the vega-lite spec, but instead using something like what @domoritz suggests. I agree, but I think that is an orthogonal issue to what I raised above :) Having said that, right now we are waiting for the native julia arrow implementation to be finished, and I think we can tackle this once that is done.
- what do we do if a user creates a spec with a file path as the data source? @mcmcgrath13's suggestion would work: it essentially amounts to loading the file into the julia process and then handling it the same way we handle all data that is already loaded in julia, namely embedding it into the spec (or in a future version, maybe using arrow a la @domoritz's suggestion).
While that would work, I still kind of feel that if a user actually creates a spec with a file path, then ideally we should respect that and make things work for that scenario in such a way that the vega data loader kicks in, assuming that is what the user wanted.
> In a node environment (which right now is almost every non-browser environment), there is no need for minification and you can just import the vega package.
The problem for us is deployment. It is much easier to deploy multiple versions of vega and vega-lite if each is just a single file. I also want to avoid running npm on client machines to install the vega and vega-lite node packages, because we had lots of problems with that on various user machines. It is also unclear to me how I can for example load a set of node packages into the electron process that powers say the VS Code web view, whereas that is super easy with a minified version.
@domoritz what is currently actually the command to create the minified version of vega? I'd at least want to give it a shot to create minified versions of vega that are identical to the version that ships as an npm package. Even if the vega team doesn't want to host them, we could potentially just build these minified versions ourselves and ship them in our clients, and all would be good.
The main difference between the package that we are building by default and the bundle that you want (one that has access to the filesystem) is the vega loader. As you can see in https://github.com/vega/vega/blob/master/packages/vega-loader/package.json#L43, we are creating different bundles for the browser, and rollup picks that one up in https://github.com/vega/vega/blob/master/packages/vega/rollup.js#L50. We are actually already generating such a bundle with https://github.com/vega/vega/blob/master/packages/vega/rollup-node.js. The issue with the existing vega-node.js bundle is that it uses require for dependencies (because of https://github.com/vega/vega/blob/master/packages/vega/rollup-node.js#L8). We are not compiling a bundle that has all dependencies in it for node, but you could write your own rollup configuration to do that (let me know if you need help with that).
Alternatively, you could create your own bundle that contains Vega, Vega-Lite, Vega Embed and all your other code but also uses the file API (you would also get support for rendering to PDFs but let's stay on topic). I think that would be even better since you already need to make a bundle yourself and you might as well bundle everything then. Again, let me know if you need help with that. I recommend rollup to bundle your code.
Ok, I spent some time on this last week, but didn't get very far. I think the first problem I ran into was that the browser flag here affects both the inclusion of the file system loader in vega-loader and which version of canvas is used by vega-canvas here. The problem I was left with at the end is that I want vega-loader to build with browser: false, but vega-canvas with browser: true, and I don't know how to do that in one go.
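One conceivable way around a single global browser flag is a small Rollup plugin that pins individual packages to an explicit entry file before the regular resolver sees them. The sketch below is untested against the actual Vega build. The plugin name `pinEntries` and the path are placeholders, and it only relies on Rollup's `resolveId` plugin hook, where returning a string resolves the import and returning `null` defers to the next plugin.

```javascript
// Hypothetical Rollup plugin: map selected import names to explicit entry files.
function pinEntries(entries) {
  return {
    name: 'pin-entries',
    // Called by Rollup for every import; null means "let other plugins decide".
    resolveId(source) {
      return Object.prototype.hasOwnProperty.call(entries, source)
        ? entries[source]
        : null;
    }
  };
}

// Placeholder path: point vega-loader at its node entry (index.js) while
// everything else keeps resolving with the browser setting.
const pinLoader = pinEntries({
  'vega-loader': '/path/to/vega-loader/index.js'
});
```

In a Rollup config, `pinLoader` would go in the `plugins` array ahead of the node resolver that is configured with browser: true, so vega-canvas keeps its browser variant. Whether the node entry of vega-loader then pulls in further node-only dependencies that also need handling is a separate question.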
My starting point for all of this was that I just took https://github.com/vega/vega/blob/master/packages/vega/rollup.js and then modified browser: true to be browser: false. It is not clear whether I would have to change anything else in principle, so that more node packages get included in the bundle that might be used by the file loader stuff in vega-loader?
Let me include @jheer in this, who created the Vega bundles. Maybe we should just include a node bundle that has all dependencies in it and does not depend on node_modules.
@jheer, would creating such a bundle that includes everything (i.e. including the file system loader) and uses the browser canvas be an option?