gosling.js

feat: add entry point for using experimental plugin data fetchers

Open sehilyi opened this issue 2 years ago • 3 comments

The motivation for this PR is to let users implement their own data fetchers and use them in the grammar. For example, my use case in a Cistrome project is to call a Cistrome API directly from a data fetcher to load their data. Since this is quite a specific use case, I want to implement the data fetcher outside Gosling.

This PR adds a new data type (experimentalPlugin) to the Gosling schema so that externally implemented data fetchers can be used in Gosling.

export interface PluginData {
    type: 'experimentalPlugin';

    /** The name of the plugin data that should match the `type` of a plugin data fetcher */
    name: string; // ← this is used as a `data.type` in a compiled HiGlass track viewConfig

    /** All custom options exposed to a plugin data fetcher */
    options: Record<string, any>; // ← a data fetcher can use these options (e.g., `this.dataConfig.cid`)
}
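
To make the two inline comments above concrete, here is a minimal sketch of how a compile step might turn a PluginData object into the `data` entry of a HiGlass track. `toHiGlassData` is a hypothetical helper, and flattening `options` to the top level is an assumption inferred from the `this.dataConfig.cid` comment; the PR itself may do this differently.

```typescript
interface PluginData {
  type: 'experimentalPlugin';
  name: string;
  options: Record<string, any>;
}

// Hypothetical helper (not part of gosling.js): builds the `data` object
// that a compiled HiGlass track would carry for a plugin data fetcher.
function toHiGlassData(data: PluginData): Record<string, any> {
  // Assumption: options are flattened next to `type`, so a fetcher could
  // read them as `this.dataConfig.cid`. Spreading options first ensures a
  // stray `options.type` cannot override the plugin name.
  return { ...data.options, type: data.name };
}

const compiled = toHiGlassData({
  type: 'experimentalPlugin',
  name: 'cistrome',
  options: { cid: 1 },
});
// compiled.type is 'cistrome' and compiled.cid is 1
```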

Actual usage example:

// register the custom data-fetcher
hgRegister(
  { dataFetcher: CistromeDataFetcher, config: CistromeDataFetcher.config },
  { pluginType: "dataFetcher" }
);

// use the plugin data
function App() {
  return (
    <GoslingComponent spec={{
      tracks: [{
        data: {
          type: 'experimentalPlugin',
          name: 'cistrome',
          options: { cid: 1 } // custom options used in the plugin data fetcher
        },
        ...
      }]
    }}/>
  );
}

Example repo: https://github.com/sehilyi/gosling-plugin-datafetcher-example/blob/master/src/App.js

sehilyi, Jul 07 '22 17:07

I think I understand the motivation behind this, but we need to think carefully about adding "plugins" or "extensions" to Gosling.

Currently the Gosling Grammar is fully self-contained, so a visualization created in one context (e.g., via Gos) can be exported and rendered in a completely different context (e.g., the Online Editor). Plugin features like this (or templates) break that compatibility guarantee and introduce dependencies that can be difficult to manage.

Part of Gosling's purpose is to provide a more structured framework for visualizing genomics datasets (in comparison to something like HiGlass, which gives plugin developers full control over data-fetching and rendering). Additionally, Gosling currently hides the fact that it is built on top of HiGlass (which has a massive API surface area).

Whatever plugin approach we choose, I would recommend hiding the fact that HiGlass is involved (or substantially limiting how much of HiGlass users have access to). Currently users can only modify the specification (and top-level properties) to change rendering behavior, but allowing users to register any HiGlass data-fetcher means that our users will now rely on HiGlass as a dependency.

For example, if we ever wanted to fork a more minimal version of HiGlass for Gosling (e.g., just the few tracks/data-fetchers that we need), we couldn't do this, since users might rely on pieces of the library that we don't use and can't know about.

manzt, Jul 08 '22 15:07

To start, I would recommend defining the minimal interface for a Gosling Data Plugin:

type DataPlugin<DataConfig> = (config: DataConfig) => {
  // we can use promise-based APIs rather than callbacks and convert internally
  tile(...): Promise<Tile>;
  tilesetInfo(...): Promise<TilesetInfo>; 
}

We can then have our own plugin registry (which wraps HiGlass's) but is not exposed to the user. This also allows us to implement additional methods which are repeated:

// internal
interface HiGlassDataFetcher<DataConfig> {
  constructor(config: DataConfig);
  tile(tileId: string, cb: (tile: Tile) => void): void;
  tilesetInfo(cb: (info: TilesetInfo) => void): void;
  fetchTilesDebounced(tileIds: string[], cb: (tile: Record<string, Tile>) => void): void;
}

function adapt<DataConfig>(dataPlugin: DataPlugin<DataConfig>): HiGlassDataFetcher<DataConfig> {
  // convert the minimal API to something that is compatible with HiGlass
}

export function register<Config>(name: string, plugin: DataPlugin<Config>) {
  hgRegister(adapt(plugin), { pluginType: "dataFetcher" });
}

const cistrome: DataPlugin<CistromeConfig> = (config) => {
  return {
    tile: /* ... */,
    tilesetInfo: /* ... */,
  } 
}

import { GoslingComponent, register } from 'gosling.js';

register('cistrome', cistrome);

// use GoslingComponent;

See how Vega did this for apache arrow: https://github.com/vega/vega-loader-arrow
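
The body of `adapt` is left open above. One way it could look is sketched below, under several assumptions: `Tile` and `TilesetInfo` are placeholders (the thread does not define them), the elided `tile(...)` and `tilesetInfo(...)` arguments are filled in with a single `tileId` string and no arguments respectively, and the result is a factory function rather than a class with a constructor. None of this is the actual HiGlass or Gosling API.

```typescript
// Placeholder shapes; the thread does not define these types.
type Tile = unknown;
type TilesetInfo = unknown;

// Promise-based plugin with assumed method arguments.
type DataPlugin<DataConfig> = (config: DataConfig) => {
  tile(tileId: string): Promise<Tile>;
  tilesetInfo(): Promise<TilesetInfo>;
};

// Callback-based shape mirroring the internal interface in the comment above.
interface CallbackDataFetcher {
  tile(tileId: string, cb: (tile: Tile) => void): void;
  tilesetInfo(cb: (info: TilesetInfo) => void): void;
  fetchTilesDebounced(tileIds: string[], cb: (tiles: Record<string, Tile>) => void): void;
}

// Wraps each promise-returning method in a callback-taking one.
// Note: the callbacks have no error channel, so rejections are not handled here.
function adapt<DataConfig>(
  plugin: DataPlugin<DataConfig>
): (config: DataConfig) => CallbackDataFetcher {
  return (config) => {
    const impl = plugin(config);
    return {
      tile: (tileId, cb) => {
        void impl.tile(tileId).then(cb);
      },
      tilesetInfo: (cb) => {
        void impl.tilesetInfo().then(cb);
      },
      // The repeated method is derived from `tile`, so plugin authors
      // never have to write it themselves.
      fetchTilesDebounced: (tileIds, cb) => {
        void Promise.all(tileIds.map((id) => impl.tile(id))).then((tiles) => {
          const byId: Record<string, Tile> = {};
          tileIds.forEach((id, i) => {
            byId[id] = tiles[i];
          });
          cb(byId);
        });
      },
    };
  };
}
```

With this shape, a plugin author only writes the two promise-returning methods, and the callback plumbing stays internal to the registry.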

manzt, Jul 08 '22 16:07

I see that the current approach is somewhat problematic since it breaks the consistency between components. I am reluctant to support this unless we figure out a way to seamlessly support plugin data fetchers in all environments, i.e., js, py, r, and the editor (or at least the first three).

I think one potential use case we can consider (which is slightly different from what I initially proposed) is enabling contributors to implement data fetchers (or new mark types) outside the repo that are intended to be merged into gosling.js eventually. Being able to work outside the large repo makes it easier for contributors to implement new features for Gosling.

I like the idea of hiding HiGlass-specific things and also simplifying the code that contributors have to implement! I think we can support this gradually as we work on data fetchers and get a better sense of their type definitions.

sehilyi, Jul 29 '22 18:07