Add Audio Input Widget
Closing https://github.com/holoviz/panel/issues/4048.
For now this is an exploration to inform the design of the widget.
Related issues
- https://github.com/holoviz/panel/issues/7035
- https://github.com/holoviz/panel/issues/7090
- https://github.com/holoviz/panel/issues/7021 (We should show how to integrate audio input with chat)
- https://github.com/holoviz/panel/issues/4861
Design
Inspiration
- Gradio real-time speech recognition guide: https://www.gradio.app/guides/real-time-speech-recognition
- https://github.com/gradio-app/gradio/blob/main/gradio/components/audio.py
- https://github.com/gradio-app/gradio/blob/main/js/audio/recorder/AudioRecorder.svelte
- Conversion in the browser: https://stackoverflow.com/questions/57365486/converting-blob-webm-to-audio-file-wav-or-mp3
- Wave Surfer: https://wavesurfer.xyz/docs/ and recording example https://wavesurfer.xyz/examples/?record.js
- Streamlit Experimental Audio Recorder: https://docs.streamlit.io/develop/api-reference/widgets/st.audio_input and https://github.com/streamlit/streamlit/tree/develop/frontend/lib/src/components/widgets/AudioInput
- Streamlit Audio Recorder, https://github.com/stefanrmmr/streamlit-audio-recorder, Audio React Recorder: https://doppelgunner.github.io/audio-react-recorder/, https://github.com/doppelgunner/audio-react-recorder?tab=readme-ov-file
- Audio Recorder Streamlit: https://github.com/Joooohan/audio-recorder-streamlit
- Streamlit-audiorecorder: https://github.com/theevann/streamlit-audiorecorder and https://github.com/samhirtarif/react-audio-recorder
- https://github.com/whitphx/streamlit-webrtc
- OpenAI Realtime API https://platform.openai.com/docs/guides/realtime/overview
Questions
Design Decisions to be taken
- Do we want to focus on audio input or combine audio and video? The Media Stream API supports both.
  - [x] Audio
  - [ ] Video
- What should the name be?
  - [ ] Microphone
  - [x] AudioInput
  - [ ] AudioRecorder
  - [ ]
- Do we want to enable incremental streaming, or just send the value when recording is finished?
  - [x] Final value
  - [ ] Streaming value
- Do we want to support more value formats than the default WebM, such as MP3, Ogg, and WAV? Converting on the client side might require cross-origin isolation. Converting on the server side might require an ffmpeg installation.
  - [x] mp3, ogg, wav
  - [x] Conversion on client side
- Do we want a bare-minimum UI (Start, Stop, Pause), or extra features?
  - [x] Submit button
  - [x] Playback button
  - [x] Waveform graph
  - [ ] Editing possibilities
- Do we want a compact UI like Streamlit or a large UI like Gradio?
  - [x] Compact UI
  - [ ] Large UI
- Do we want to build on the raw MediaStream Recording API or on a library?
  - [x] Wavesurfer https://wavesurfer.xyz/docs/ (what Gradio and Streamlit are using)
  - [ ] React Audio Recorder https://github.com/samhirtarif/react-audio-recorder (looks really simple to implement, but uses React)
- Do we want to make it easy for users to:
  - [x] Play the value in the Audio pane?
  - [x] Work with the value as a NumPy array?
  - [x] Work with the value as a data URL?
  - [x] Save the value to a file?
- How do we most efficiently transfer the media from client to server?
  - [x] Bytes (I don't yet know how to do this)
  - [ ] Data URL (this is easily doable)
Items marked with [x] are the options we should implement.
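To make the value-format and transfer questions above more concrete, here is a minimal sketch of what the Python side could look like if the client sends either raw bytes or a base64 data URL. It uses only the standard library. The helper names (`data_url_to_bytes`, `bytes_to_data_url`, `wav_bytes_to_samples`) are hypothetical and not part of Panel, and the WAV helper assumes the client has already converted the recording to 16-bit PCM WAV.

```python
import base64
import io
import sys
import wave
from array import array


def data_url_to_bytes(data_url: str) -> tuple[str, bytes]:
    """Split a base64 data URL into (mime_type, raw_bytes).

    Hypothetical helper for illustration, not a Panel API.
    """
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


def bytes_to_data_url(raw: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def wav_bytes_to_samples(raw: bytes) -> tuple[int, array]:
    """Decode 16-bit PCM WAV bytes into (sample_rate, samples).

    Assumes 16-bit PCM; samples of multi-channel audio stay interleaved.
    """
    with wave.open(io.BytesIO(raw), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        # WAV data is little-endian; swap on big-endian hosts.
        samples.byteswap()
    return rate, samples
```

Saving to a file is then just `pathlib.Path("recording.wav").write_bytes(raw)`, and `numpy.frombuffer` could replace the `array` step where NumPy is available. Base64 makes a data URL roughly a third larger than the raw bytes, which is one argument for the bytes option.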
Codecov Report
Attention: Patch coverage is 0% with 39 lines in your changes missing coverage. Please review.
Project coverage is 82.14%. Comparing base (5ef8909) to head (a0d9dce). Report is 258 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| panel/widgets/microphone.py | 0.00% | 39 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #7363 +/- ##
==========================================
- Coverage 82.21% 82.14% -0.08%
==========================================
Files 337 338 +1
Lines 50513 50552 +39
==========================================
- Hits 41530 41524 -6
- Misses 8983 9028 +45
If you are interested in audio input, feel free to comment on the questions above, @philippjfr and @ahuang11.
Not questions (yet), but is there a corresponding JS file?
No. Currently just exploring the design.