grist-core

Python console ?

Open yohanboniface opened this issue 3 years ago • 3 comments

Hi :)

While working on a document, I often feel the need for a more comfortable place to inspect the data through Python. For example, while analyzing it with a data producer, I sometimes need to quickly check whether a column contains duplicates, whether some values in column A also appear in column B, or how many "invalid rows" or rows matching a pattern there are, etc., without needing to store the result.

How hard would it be to have a sort of Python console inside a document? I'm imagining it maybe as a widget (based on Jupyter?).
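
For illustration, here is a minimal sketch of the kind of throwaway checks meant here, in plain Python. The column values and the names `emails`, `col_a`, `col_b` and the helper functions are hypothetical; nothing here uses a Grist API.

```python
import re
from collections import Counter

# Hypothetical column values, standing in for data pulled from a document.
emails = ["a@example.com", "b@example.com", "a@example.com", "not-an-email"]
col_a = ["x", "y", "z"]
col_b = ["y", "z", "w"]

def duplicates(values):
    """Return the values that appear more than once, in first-seen order."""
    counts = Counter(values)
    return [v for v in counts if counts[v] > 1]

def common_values(a, b):
    """Return the values of `a` that also appear in `b`, sorted."""
    return sorted(set(a) & set(b))

def count_matching(values, pattern):
    """Count the values that fully match the regular expression `pattern`."""
    regex = re.compile(pattern)
    return sum(1 for v in values if regex.fullmatch(v))

print(duplicates(emails))
print(common_values(col_a, col_b))
print(count_matching(emails, r"[^@\s]+@[^@\s]+\.[^@\s]+"))
```

None of these results needs to be stored in the document, which is the point of having a console next to the data.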

Yohan, dreaming out loud :)

yohanboniface avatar Sep 02 '22 10:09 yohanboniface

In an early pre-release version we actually used to have a REPL, which ran in the context of the sandbox and allowed sending Python statements and having them executed. We dropped that idea because it had neither a convenient UI nor any good way to work with the document on the Python side.

A Jupyter notebook interface sounds much better. This is something I tried once, and it felt useful, but it involved running things locally: a backend (to actually run/serve Jupyter) and a way to connect Jupyter to Grist to get data. The way it was structured wasn't promising enough to share anything. But I am sure a better approach is possible.

dsagal avatar Sep 02 '22 18:09 dsagal

Just a live example to illustrate: I need to compare the headers of two tables (which have more than 50), to see how they differ. :)
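
A minimal sketch of that comparison in plain Python, assuming the two header lists have already been obtained somehow. The lists `table_1` and `table_2` and the helper `compare_headers` are made up for illustration.

```python
def compare_headers(headers_a, headers_b):
    """Return (only_in_a, only_in_b, in_both), each as a sorted list."""
    a, b = set(headers_a), set(headers_b)
    return sorted(a - b), sorted(b - a), sorted(a & b)

# Hypothetical header lists; the real tables have more than 50 columns each.
table_1 = ["id", "name", "email", "city"]
table_2 = ["id", "name", "phone", "city", "country"]

only_1, only_2, shared = compare_headers(table_1, table_2)
print("only in table 1:", only_1)
print("only in table 2:", only_2)
print("in both:", shared)
```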

Have you already looked at Replit? https://replit.com/@replit/Python?v=1

Or maybe a first step would be to expose the sandboxed Python through the widget API? I guess that would be very limited (no autocomplete, no state…), but it might be a way to have a custom widget with a textarea in which to paste a Python script?

yohanboniface avatar Sep 07 '22 16:09 yohanboniface

Do you think it would be possible with a custom widget? Having a custom widget with Jupyter or Replit and an injected grist object would be amazing! Is that what you would expect, @yohanboniface?

LouisDelbosc avatar Sep 15 '22 09:09 LouisDelbosc

I think a custom widget running JupyterLite (no server required!) would work well.

alexmojaki avatar Oct 03 '22 20:10 alexmojaki

Agree that a JupyterLite solution looks very doable. The only core app work needed in that case would be to put a button somewhere to bring up a preconfigured custom widget. @dsagal I think you had also mentioned your previous experiment on integrating a (server based) Jupyter notebook to @vviers. Do you remember how you represented or connected to Grist within Jupyter? Just wondering if any of that could be reused in JupyterLite.

paulfitz avatar Jul 25 '23 16:07 paulfitz

I dug up the old experiment, and I see that it used the "backend plugin" functionality that was never polished and has since been dropped entirely.

What it allowed was to start a Node process on the server side, running code provided by the plugin, with RPC piping to allow communication from custom widget code. In my experiment (which I only ever tried to run locally), the custom widget, on startup, used RPC to call a method in this backend process, which spawned a server-side Jupyter process and returned its URL, which the custom widget then loaded in an iframe. Additionally, this server-side process exposed a couple of endpoints for fetching data, so that pandas.read_json(...) would return a ready-to-use data frame.

This was the relevant part of my experiment:

  • Endpoints (in a Node process, using Express):
    app.route('/tables')
      .get(expressWrap(() => gristDocAPI.listTables()));
    app.route('/tables/:tableId')
      .get(expressWrap(async (req) => gristDocAPI.fetchTable(req.params.tableId)));
    
    function expressWrap(callback: (req: express.Request, res: express.Response) => any): express.RequestHandler {
      return async (req, res, next) => {
        try {
          res.json(await callback(req, res));
        } catch (err) {
          next(err);
        }
      };
    }
    
  • Python library:
    import os
    import pandas

    # Base URL of the Node endpoints above, read from the environment.
    GRIST_URL = os.getenv("GRIST_API_SERVER_URL")
    
    def list_tables():
        """Fetches and returns a list of table ids in the current document."""
        return pandas.read_json(GRIST_URL + "/tables")[0].tolist()
    
    def fetch_table(table_id):
        """Returns the given table's data from the current document, as a pandas data frame."""
        return pandas.read_json(GRIST_URL + "/tables/" + table_id)
    

Not much to it...

With JupyterLite, I imagine the endpoint isn't needed at all, since Grist now gives custom widgets access to the REST API using getAccessToken.
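
As a rough sketch of what that could look like on the Python side, the two helpers below build a records URL from an access token and flatten the response into rows. This is only an illustration under assumptions: that the token info obtained by the widget carries a base URL and a token, that the token can be passed as an `auth` query parameter, and that the records endpoint returns a payload shaped like `{"records": [{"id": ..., "fields": {...}}]}`. The names `records_url` and `records_to_rows` and the sample values are hypothetical, and no network request is made.

```python
from urllib.parse import quote, urlencode

def records_url(base_url, table_id, token):
    """Build the URL for a table's records (assumed URL layout, see above)."""
    path = f"{base_url.rstrip('/')}/tables/{quote(table_id, safe='')}/records"
    return f"{path}?{urlencode({'auth': token})}"

def records_to_rows(payload):
    """Flatten an assumed records payload into one dict per row."""
    return [{"id": rec["id"], **rec["fields"]} for rec in payload["records"]]

# Made-up values standing in for what the widget and the REST API would supply.
url = records_url("https://docs.example.com/api/docs/abc/", "People", "tok123")
sample_payload = {
    "records": [
        {"id": 1, "fields": {"Name": "Ada", "City": "Paris"}},
        {"id": 2, "fields": {"Name": "Bob", "City": "Lyon"}},
    ]
}
rows = records_to_rows(sample_payload)
print(url)
print(rows)
```

A list of dicts like `rows` can be passed straight to `pandas.DataFrame(rows)` to get the same kind of data frame as `fetch_table` in the old experiment.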

dsagal avatar Jul 27 '23 08:07 dsagal

We now have a widget ready to try: https://github.com/gristlabs/jupyterlite-widget/blob/main/USAGE.md

alexmojaki avatar Oct 24 '23 21:10 alexmojaki