panel Add Audio Input Widget

Closing https://github.com/holoviz/panel/issues/4048.

For now this is an exploration to inform the design of the widget.

Related issues

https://github.com/holoviz/panel/issues/7035
https://github.com/holoviz/panel/issues/7090
https://github.com/holoviz/panel/issues/7021 (We should show how to integrate audio input with chat)
https://github.com/holoviz/panel/issues/4861

Design

Inspiration

Gradio Real time speech recognition: https://www.gradio.app/guides/real-time-speech-recognition
- https://github.com/gradio-app/gradio/blob/main/gradio/components/audio.py
- https://github.com/gradio-app/gradio/blob/main/js/audio/recorder/AudioRecorder.svelte
Conversion in the browser: https://stackoverflow.com/questions/57365486/converting-blob-webm-to-audio-file-wav-or-mp3
Wave Surfer: https://wavesurfer.xyz/docs/ and recording example https://wavesurfer.xyz/examples/?record.js
Streamlit Experimental Audio Recorder: https://docs.streamlit.io/develop/api-reference/widgets/st.audio_input and https://github.com/streamlit/streamlit/tree/develop/frontend/lib/src/components/widgets/AudioInput
Streamlit Audio Recorder: https://github.com/stefanrmmr/streamlit-audio-recorder and Audio React Recorder: https://doppelgunner.github.io/audio-react-recorder/, https://github.com/doppelgunner/audio-react-recorder?tab=readme-ov-file
Audio Recorder Streamlit: https://github.com/Joooohan/audio-recorder-streamlit
Streamlit-audiorecorder: https://github.com/theevann/streamlit-audiorecorder and https://github.com/samhirtarif/react-audio-recorder
Streamlit WebRTC: https://github.com/whitphx/streamlit-webrtc
OpenAI Realtime API: https://platform.openai.com/docs/guides/realtime/overview

Questions

Design Decisions to be taken

Do we want to focus on audio input or combine audio and video? The Media Stream API supports both.
- [x] Audio.
- [ ] Video
What should the name be?
- [ ] Microphone
- [x] AudioInput
- [ ] AudioRecorder
Do we want to enable incremental streaming, or just send the value when the recording is finished?
- [x] Final Value
- [ ] Streaming value
Do we want to support more value formats than the default webm, such as mp3, ogg and wav? Converting on the client side might require cross-origin isolation. Converting on the server side might require an ffmpeg installation.
- [x] mp3, ogg, wav
- [x] Conversion on client side
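
To illustrate the server-side trade-off mentioned above, here is a minimal sketch of what conversion via the ffmpeg command line could look like. It assumes an `ffmpeg` executable is on the PATH; the helper names (`ffmpeg_available`, `build_ffmpeg_command`, `convert_audio`) are hypothetical and not part of Panel.

```python
import shutil
import subprocess

# Target formats from the question above, mapped to ffmpeg muxer names.
FFMPEG_FORMATS = {"wav": "wav", "mp3": "mp3", "ogg": "ogg"}


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


def build_ffmpeg_command(target_format: str) -> list:
    """Build a command that reads audio from stdin and writes to stdout."""
    if target_format not in FFMPEG_FORMATS:
        raise ValueError(f"Unsupported format: {target_format!r}")
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",                       # read the recording from stdin
        "-f", FFMPEG_FORMATS[target_format],  # force the output container
        "pipe:1",                             # write the result to stdout
    ]


def convert_audio(data: bytes, target_format: str) -> bytes:
    """Convert recorded bytes (e.g. webm) on the server (hypothetical helper)."""
    if not ffmpeg_available():
        raise RuntimeError("ffmpeg is required for server-side conversion")
    result = subprocess.run(
        build_ffmpeg_command(target_format),
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if ffmpeg fails
    )
    return result.stdout
```

The `ffmpeg_available()` check is the part that makes this option costly for users: every deployment would need the extra system dependency, which is one argument for converting on the client side instead.
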
Do we want a bare-minimum UI (Start, Stop, Pause), or extra features?
- [x] submit button
- [x] playback button
- [x] wave graph
- [ ] editing possibilities
Do we want a compact UI like Streamlit or a large UI like Gradio?
- [x] Compact UI
- [ ] Large UI
Do we want to build on the raw MediaStream Recording API or on a library?
- [x] Wavesurfer https://wavesurfer.xyz/docs/ (what Gradio and Streamlit use)
- [ ] React Audio Recorder https://github.com/samhirtarif/react-audio-recorder (looks really simple to implement, but uses React)
Do we want to make it easy for users to:
- [x] play the value in the Audio pane?
- [x] work with the value as a NumPy array?
- [x] work with the value as a data URL?
- [x] save the value to a file?
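
As a rough sketch of what these conveniences could look like, the helpers below work on a recording held as bytes. They assume the value is 16-bit PCM WAV (webm, mp3 or ogg would need a decoder first), and all function names are hypothetical, not an existing Panel API. NumPy is imported lazily so the other helpers work without it.

```python
import base64
import io
import math
import struct
import wave


def make_wav_bytes(n_samples=800, freq=440.0, rate=8000) -> bytes:
    """Build a short mono 16-bit WAV in memory as a stand-in for a recording."""
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        w.writeframes(frames)
    return buf.getvalue()


def to_data_url(data: bytes, mime_type: str = "audio/wav") -> str:
    """Encode the bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_url(url: str):
    """Split a base64 data URL into (mime_type, bytes)."""
    header, _, payload = url.partition(",")
    mime_type = header[len("data:"):].split(";")[0]
    return mime_type, base64.b64decode(payload)


def save_audio(data: bytes, path) -> None:
    """Write the recording to a file unchanged."""
    with open(path, "wb") as f:
        f.write(data)


def wav_to_array(data: bytes):
    """Decode 16-bit PCM WAV bytes into (sample_rate, array of shape (frames, channels))."""
    import numpy as np  # lazy import: only needed for this helper

    with wave.open(io.BytesIO(data), "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("Only 16-bit PCM WAV is handled in this sketch")
        rate = w.getframerate()
        channels = w.getnchannels()
        frames = w.readframes(w.getnframes())
    samples = np.frombuffer(frames, dtype="<i2")
    return rate, samples.reshape(-1, channels)
```

For example, `to_data_url(make_wav_bytes())` yields a string starting with `data:audio/wav;base64,`, and `from_data_url` recovers the original bytes from it.
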
How do we most efficiently transfer the media from client to server?
- [x] Bytes (I don't yet know how to do this)
- [ ] data url (This is easily doable)
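
One reason to prefer raw bytes is size: base64 encodes every 3 bytes as 4 characters, so a data URL is roughly a third larger than the recording it carries. The small sketch below checks that arithmetic; `data_url_size` is a hypothetical helper and the `audio/webm` MIME type is an assumption.

```python
import base64
import math


def data_url_size(n_bytes: int, mime_type: str = "audio/webm") -> int:
    """Predict the length of a base64 data URL for a payload of n_bytes."""
    prefix = len(f"data:{mime_type};base64,")
    # base64 emits 4 characters per 3 input bytes, padding the last group.
    return prefix + 4 * math.ceil(n_bytes / 3)


payload = bytes(1_000_000)  # stand-in for a 1 MB recording
actual = len("data:audio/webm;base64," + base64.b64encode(payload).decode("ascii"))
overhead = actual / len(payload) - 1  # fraction added by the data URL encoding
```
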

Items marked with [x] are the options I propose we implement.

Oct 06 '24 07:10 MarcSkovMadsen

Codecov Report

Attention: Patch coverage is 0% with 39 lines in your changes missing coverage. Please review.

Project coverage is 82.14%. Comparing base (5ef8909) to head (a0d9dce). Report is 258 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| panel/widgets/microphone.py | 0.00% | 39 Missing :warning: |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7363      +/-   ##
==========================================
- Coverage   82.21%   82.14%   -0.08%     
==========================================
  Files         337      338       +1     
  Lines       50513    50552      +39     
==========================================
- Hits        41530    41524       -6     
- Misses       8983     9028      +45

Oct 06 '24 08:10 codecov[bot]

@philippjfr and @ahuang11, if you are interested in audio input, feel free to comment on the questions above.

Oct 06 '24 08:10 MarcSkovMadsen

No questions (yet), but is there a corresponding JS file?

Oct 15 '24 12:10 philippjfr

No. Currently just exploring the design.

Oct 15 '24 17:10 MarcSkovMadsen