[BUG Regression] Audio streaming in gradio 4.18.0+ does not work

Open pseudotensor opened this issue 1 year ago • 22 comments

Describe the bug

Audio streaming does not work in gradio 4.18.0 or 4.19.0 but works fine in gradio 3.50.2 and up to and including 4.17.0.

Have you searched existing issues? 🔎

  • [X] I have searched and found no existing issues

Reproduction

This is the literal example from the guide, with only autoplay=True added so that streaming starts without clicking anything: https://www.gradio.app/guides/reactive-interfaces#streaming-components

Note that the guide example has a typo, format="bytes", but that has no effect.

You need to add a wave file; I attached an example.

audio.zip

  1. Unzip audio.zip, or edit the script to use your own audio
  2. Run the demo
  3. Click on the first example
  4. Click "Stream as Bytes"

You'll see that streaming via bytes fails. The audio component is updated, but no audio appears and nothing is played.

import gradio as gr
from pydub import AudioSegment
from time import sleep

with gr.Blocks() as demo:
    input_audio = gr.Audio(label="Input Audio", type="filepath", format="mp3")
    with gr.Row():
        with gr.Column():
            stream_as_file_btn = gr.Button("Stream as File")
            format = gr.Radio(["wav", "mp3"], value="wav", label="Format")
            stream_as_file_output = gr.Audio(streaming=True)

            def stream_file(audio_file, format):
                audio = AudioSegment.from_file(audio_file)
                i = 0
                chunk_size = 1000
                while chunk_size * i < len(audio):
                    chunk = audio[chunk_size * i : chunk_size * (i + 1)]
                    i += 1
                    if chunk:
                        file = f"/tmp/{i}.{format}"
                        chunk.export(file, format=format)
                        yield file
                        sleep(0.5)

            stream_as_file_btn.click(
                stream_file, [input_audio, format], stream_as_file_output
            )

            gr.Examples(
                [["audio/speech.wav", "wav"], ["audio/speech.wav", "mp3"]],
                [input_audio, format],
                fn=stream_file,
                outputs=stream_as_file_output,
                cache_examples=True,
            )

        with gr.Column():
            stream_as_bytes_btn = gr.Button("Stream as Bytes")
            stream_as_bytes_output = gr.Audio(streaming=True, autoplay=True)

            def stream_bytes(audio_file):
                chunk_size = 20_000
                with open(audio_file, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if chunk:
                            yield chunk
                            sleep(1)
                        else:
                            break
            stream_as_bytes_btn.click(stream_bytes, input_audio, stream_as_bytes_output)

if __name__ == "__main__":
    demo.queue().launch()

Screenshot

gradio 4.16.0 or 4.17.0, which work:

image

gradio 4.18.0 where things stop working:

image

Logs

No response

System Info

I tried many gradio versions. gradio 4.18.0, 4.19.0, and 4.19.1 all fail to work.
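For anyone who needs to guard against the affected versions, here is a minimal sketch that classifies a version string by the reports in this issue. The names `parse_version`, `LAST_REPORTED_WORKING`, and `reported_working` are hypothetical helpers, not part of gradio, and the cutoff reflects only what has been reported here (4.17.0 and earlier work, 4.18.0 and later fail). It assumes plain `X.Y.Z` version strings without pre-release suffixes.

```python
# Hypothetical helper, not part of gradio: classify a version string
# according to the reports in this issue only.
LAST_REPORTED_WORKING = (4, 17, 0)


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split(".")[:3])


def reported_working(version_text):
    """True if the version is at or below the last version reported to work."""
    return parse_version(version_text) <= LAST_REPORTED_WORKING


# Versions mentioned in this issue.
for candidate in ["3.50.2", "4.17.0", "4.18.0", "4.19.1"]:
    print(candidate, reported_working(candidate))
```

To check the installed package, the version string could be obtained with `importlib.metadata.version("gradio")` and passed to `reported_working`.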

Severity

Blocking usage of gradio

pseudotensor avatar Feb 21 '24 01:02 pseudotensor

FYI @hannahblair and @abidlabs you made some changes to Audio for 4.18.0

pseudotensor avatar Feb 21 '24 02:02 pseudotensor

Thanks @pseudotensor we'll take a look. @hannahblair let's add an e2e test for this as we've seen regressions around audio streaming before

abidlabs avatar Feb 21 '24 02:02 abidlabs

Will take a look, thanks for raising this @pseudotensor

hannahblair avatar Feb 21 '24 10:02 hannahblair

Any luck? Still not working with gradio 4.20.1

pseudotensor avatar Mar 06 '24 20:03 pseudotensor

Hi @hannahblair Any luck?

pseudotensor avatar Mar 08 '24 10:03 pseudotensor

Update on this regression?

pseudotensor avatar Mar 10 '24 17:03 pseudotensor

Not yet but we’ll look into it this week cc @aliabid94

abidlabs avatar Mar 10 '24 18:03 abidlabs

Thanks!

pseudotensor avatar Mar 13 '24 18:03 pseudotensor

Experiencing the same issues here, regressing to 4.16.0 fixed it for me but 4.21.0 is broken.

0xbitches avatar Mar 14 '24 11:03 0xbitches

Also, please make sure that downloading audio works, i.e. clicking the audio element in the UI to download. I noticed that is broken too.

That is, in 4.16.0 or 3.50.2 streaming audio still works, and the Audio element is filled. But when clicking on download, it waits for a long time and then downloads a zero-size text file, e.g. named [axy8zweof3_140268413143696_280.txt](http://0.0.0.0:7860/stream/axy8zweof3/140268413143696/280).

pseudotensor avatar Mar 14 '24 22:03 pseudotensor

Hi @aliabid94 and @hannahblair any luck here in fixing this regression?

pseudotensor avatar Mar 20 '24 01:03 pseudotensor

@pseudotensor I ran into the same problem, and after 3 days of tracing the code, this workaround works for me:

--- a/gradio/components/audio.py
+++ b/gradio/components/audio.py
@@ -302,6 +302,7 @@ class Audio(
                     )
                 else:
                     binary_data = binary_data[44:]
+            output_file["url"]=f"stream/{output_id}"
         return binary_data, output_file

@abidlabs I can submit a PR if this is a legit fix.

This works on Edge but not on Safari.

cschin avatar Mar 21 '24 16:03 cschin

Sorry the team has been heads down on some other PRs, but we're going to tackle this next. @cschin feel free to open a PR yes and we'll test it out

abidlabs avatar Mar 21 '24 18:03 abidlabs

That change didn't work for me for chrome or firefox. No effect.

pseudotensor avatar Mar 21 '24 20:03 pseudotensor

> That change didn't work for me for chrome or firefox. No effect.

In my case, the audio element after 4.16 does not have the url property populated for the

As far as I can tell, the change for the streaming chatbot broke the process of getting a URL for the SSE binary audio streaming. A couple of later changes also altered how the URL is processed.

I am testing with 4.22.0 and the Edge browser. Safari does not work because it requires a very specific response for the binary streaming output. Edge seems to tolerate a plain raw data stream (with the .wav header in the first packet). Also, I am using a Mac. When I have a chance, I can check how other browsers behave.

Edit: I just checked with Chrome on my Mac, and it works. @pseudotensor, can you check whether the "url" property is properly filled in the HTML?
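To illustrate what "raw data stream with the .wav header in the first packet" means, here is a minimal standard-library sketch, independent of gradio. It builds a small silent WAV in memory, splits it into fixed-size byte chunks the same way the `stream_bytes` repro reads a file, and checks which chunks start with a RIFF/WAVE header. The helper names (`make_wav_bytes`, `iter_chunks`, `has_wav_header`) and the chunk size are assumptions for illustration only.

```python
import io
import wave


def make_wav_bytes(num_frames=8000, sample_rate=8000):
    """Build a small silent mono 16-bit PCM WAV file in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample (16-bit)
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


def iter_chunks(data, chunk_size):
    """Yield consecutive fixed-size slices, like reading a file in chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def has_wav_header(chunk):
    """True if the chunk begins with a RIFF container marked as WAVE."""
    return len(chunk) >= 12 and chunk[:4] == b"RIFF" and chunk[8:12] == b"WAVE"


wav_bytes = make_wav_bytes()
chunks = list(iter_chunks(wav_bytes, chunk_size=4000))
header_flags = [has_wav_header(c) for c in chunks]
print(header_flags)  # only the first chunk carries the header
```

Only the first chunk is a self-describing WAV; every later chunk is bare sample data, so a player has to treat the whole sequence as one continuous stream in order to decode it.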

cschin avatar Mar 21 '24 20:03 cschin

The UI element connected to the streaming of audio has no audio but was touched:

image

pseudotensor avatar Mar 21 '24 21:03 pseudotensor

@pseudotensor if you have some simplified test code, I can check whether my patch works with it.

Edit: this test case works for me (Edge, Chrome, not Safari) https://gist.github.com/cschin/a28f97b95823a35aff47cda761589673

cschin avatar Mar 22 '24 15:03 cschin

Hi, the repro is at the top of this issue.

pseudotensor avatar Mar 22 '24 19:03 pseudotensor

@pseudotensor the format="bytes" runs on a different code branch. I added the url to the output and it works on my test using the demo (main and [email protected])

cschin avatar Mar 23 '24 01:03 cschin

Hi, any progress on this issue?

pseudotensor avatar Mar 27 '24 15:03 pseudotensor

Picking this up now, sorry for the delay!

aliabid94 avatar Mar 28 '24 17:03 aliabid94

Thanks! Hopeful! :)

pseudotensor avatar Mar 28 '24 19:03 pseudotensor