
Lite: pseudo-HTTP req/res to the Lite server is much slower than real HTTP access

Open whitphx opened this issue 1 year ago • 1 comments

Sample:

import gradio as gr

demo = gr.Interface(
    fn=lambda x: x,
    inputs=gr.Image(type="pil", streaming=True),
    outputs="image",
    live=True,
)

demo.launch()

With the sample code above, the image frames coming back from the server are noticeably more delayed in Lite than in normal Gradio.

When the image is resized to be smaller, the streaming becomes faster. This, together with the logs in the console, suggests that the responses sent over the pseudo-HTTP connection in Lite are not fast enough.

import gradio as gr

def fn(img):
    img.thumbnail((100, 100))
    return img

demo = gr.Interface(
    fn=fn,
    inputs=gr.Image(type="pil", streaming=True),
    outputs="image",
    live=True,
)

demo.launch()

whitphx avatar Aug 13 '24 08:08 whitphx

With rough print debugging, I found:

  • Image.postprocess took ~0.8s.
    • This becomes faster if the image is smaller (e.g. img.thumbnail((100, 100))), but this seems to be an inevitable tradeoff rather than a problem.
  • The time between the process_completed event being dispatched from the server and the re-rendering of Image.svelte is ~0.5s while uploading is enabled, but only ~0.01s when uploading is turned off and the remaining frames are being rendered. This behavior is why I thought the issue was about the pseudo-HTTP access, but we should take a closer look, because rendering itself is not directly related to the pseudo-HTTP access (perhaps the delay comes from busy processing?).
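
For anyone who wants to reproduce this kind of rough timing, here is a minimal sketch of a print-debug helper. It is not the instrumentation actually used above: `timed`, `timings`, and `fake_postprocess` are hypothetical names, and `fake_postprocess` is only a stand-in that base64-encodes a raw RGB buffer so that the cost grows with the pixel count. It does not call Gradio's real Image.postprocess.

```python
import base64
import time
from contextlib import contextmanager

# Collected durations per label, so repeated frames can be compared later.
timings = {}


@contextmanager
def timed(label):
    """Print and record how long the wrapped block took (hypothetical helper)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings.setdefault(label, []).append(elapsed)
        print(f"[timing] {label}: {elapsed:.4f}s")


def fake_postprocess(width, height):
    """Stand-in for a postprocess step (assumption, not Gradio's code):
    encode a blank RGB frame of the given size as base64 text."""
    raw = bytes(width * height * 3)  # 3 bytes per pixel, all zeros
    return base64.b64encode(raw).decode("ascii")


# Compare a full-size frame with a thumbnail-sized one.
for width, height in ((640, 480), (100, 100)):
    with timed(f"postprocess {width}x{height}"):
        fake_postprocess(width, height)
```

In a real investigation the `with timed(...)` block would wrap the actual step being measured (for example the body of `fn`, or a call placed around the postprocess step); the absolute numbers from this stand-in say nothing about Lite itself.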

whitphx avatar Aug 14 '24 16:08 whitphx

Lite is no longer maintained but someone can fork it and contribute a fix for this.

Last release of lite: https://github.com/gradio-app/gradio-lite

freddyaboulton avatar Sep 18 '25 14:09 freddyaboulton