rerun Prototype Gradio Support

A few of the complexities to figure out with Gradio:

(1) Gradio (and in particular Hugging Face Spaces) runs as a persistent service shared across multiple users. This means we need to avoid, or at least be able to avoid, global session state interfering across multiple user sessions.

(2) In Gradio's normal execution model, jobs fully execute, then produce an output. This is somewhat similar to how our current notebook cells work, with a single static resource output. Instead, we would like the embedded Rerun viewer context to exist the whole time the job is running, with data incrementally streaming to it. Gradio already has its own incremental streaming mechanism, used for things like streaming audio or video, so tying into this framework would be nice.

(3) HF Spaces, including Gradio, do everything over a single port. This means we really don't want to depend on running our own websocket server. Somehow retrieving the data stream back from Rerun and tunneling it through Gradio's websocket, as is done with the other data that Gradio transmits, would simplify integration. This is also somewhat related to user isolation, since Gradio already manages user session context.
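To make the session-isolation concern concrete, here is a minimal sketch in plain Python. It assumes each user session can be identified by some string key; `SessionRecording` and `SessionRegistry` are hypothetical names for illustration and are not part of Rerun or Gradio.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SessionRecording:
    """Hypothetical stand-in for the recording state of one user session."""
    session_id: str
    chunks: List[bytes] = field(default_factory=list)


class SessionRegistry:
    """Keeps one recording per session instead of a single global recording."""

    def __init__(self) -> None:
        self._recordings: Dict[str, SessionRecording] = {}

    def get(self, session_id: str) -> SessionRecording:
        # Create the recording lazily the first time a session is seen.
        if session_id not in self._recordings:
            self._recordings[session_id] = SessionRecording(session_id)
        return self._recordings[session_id]

    def close(self, session_id: str) -> None:
        # Drop the state when the session ends so the service does not leak memory.
        self._recordings.pop(session_id, None)
```

With this shape, data logged for one session never appears in another session's recording, which is the property a long-running shared service needs.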

May 01 '23 18:05 jleibs

https://x.com/radamar/status/1785066677706928326 https://huggingface.co/spaces/radames/gradio_rerun

Apr 30 '24 06:04 emilk

I have an idea for a possible direction building on top of https://github.com/radames/gradio-rerun-viewer/tree/main that I think would play nicely with the Gradio streaming execution model: https://www.gradio.app/guides/reactive-interfaces

This requires building a little bit of extra functionality onto the Python and JavaScript APIs:

- A small extension to the MemoryRecording that lets us drain the current buffer and return it to Python as an in-memory RRD (while keeping the recording_id the same).
- One more JavaScript API, exposed via our npm package, that lets us directly inject a buffer of bytes instead of an RRD URL. This would call through to something similar to stream_rrd_from_event_listener, but can probably skip the message-handler logic, since without the iframe we can call the Wasm bindings directly.

This lets us set up Gradio in streaming mode and have a job incrementally yield RRD chunks, which are sent over Gradio's comms mechanism and then added to the viewer, where they are accumulated into a single RRD.
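The drain-and-accumulate flow can be sketched in plain Python without touching either library. `DrainableBuffer`, `stream_chunks`, and `accumulate` are hypothetical stand-ins: the first plays the role of the extended memory recording, the second the streaming job, and the third the viewer side. The sketch assumes that chunks can simply be appended in the order they arrive.

```python
from typing import Iterable, Iterator


class DrainableBuffer:
    """Hypothetical stand-in for a memory recording that can be drained."""

    def __init__(self) -> None:
        self._pending = bytearray()

    def write(self, data: bytes) -> None:
        # Stands in for logging data into the recording.
        self._pending.extend(data)

    def drain(self) -> bytes:
        # Return everything written since the last drain, then reset the buffer.
        chunk = bytes(self._pending)
        self._pending.clear()
        return chunk


def stream_chunks(frames: Iterable[bytes]) -> Iterator[bytes]:
    """Job side: yield one drained chunk per processed frame."""
    buffer = DrainableBuffer()
    for frame in frames:
        buffer.write(frame)
        yield buffer.drain()


def accumulate(chunks: Iterable[bytes]) -> bytes:
    """Viewer side: append incoming chunks into one growing recording."""
    received = bytearray()
    for chunk in chunks:
        received.extend(chunk)
    return bytes(received)
```

Because each chunk only contains what was written since the previous drain, nothing is sent twice and the receiving side ends up with the same bytes as a single non-streamed recording would have held.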

Theoretical API might look something like:

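import gradio as gr
import rerun as rr

# Generator job: each yielded chunk is streamed to the output component.
def stream_rrd(some_input):
    # rr.buffered_stream is part of the proposed API.
    with rr.buffered_stream("rerun_stream_example") as stream:
        for frame_idx, frame in enumerate(some_input):
            rr.set_time_sequence("frame", frame_idx)
            rr.log("input", rr.Image(frame))
            output = process(frame)  # placeholder for the user's processing step
            rr.log("output", rr.Image(output))

            # Drain what was logged since the last flush and send it on.
            yield stream.flush()

with gr.Blocks() as demo:
    with gr.Row():
        input_component = gr.SomeInput()  # placeholder for any input component
        btn = gr.Button("Run")
    with gr.Row():
        viewer = gr.Rerun(streaming=True)  # proposed viewer component

    gr.on([btn.click], fn=stream_rrd, inputs=[input_component], outputs=[viewer])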

Apr 30 '24 13:04 jleibs

Reminder to self: give the gradio component a blueprint argument that does the right thing.

May 10 '24 14:05 jleibs

https://github.com/radames/gradio-rerun-viewer/pull/1

May 29 '24 20:05 jleibs

This has been published.

Jun 03 '24 23:06 jleibs