screenpipe
[bug] Repetitive Transcription During Silence or White Noise Periods
Description
- Screenpipe version: 0.5.6 (20241017.030846)
- Hardware: Apple M3 Max MacBook Pro (14-inch, Nov 2023), 128 GB memory
- OS: macOS Sequoia 15.1 Beta (24B5077a)
Screenpipe's transcription feature generates repetitive text during periods of silence or white noise. This degrades the accuracy of meeting summaries and may be related to how the app processes audio input when no distinct speech is detected.
The repetitive text appears as:
"The world is a great place to be able to get the world to be able to get the world..." or other disjointed phrases.
This redundancy prevents the app from providing meaningful summaries and indicates a potential bug in the transcription process or AI settings.
Current Settings
- Audio Transcription Model: whisper-large-turbo
- OCR Model: apple native
- Monitors: 1 monitor(s) selected
- Audio Devices: 2 device(s) selected
  - MacBook Pro Microphone (input) (default)
  - Display 1 (output) (default)
- Languages: english
- Restart Interval: 0 minutes (experimental feature)
Steps to Reproduce
- Open the Screenpipe app with the current settings.
- Start a recording where there are expected periods of silence or white noise.
- Check the transcription output in the "meeting and conversation history" section.
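To make the reproduction less dependent on a live meeting, a test clip containing only silence and white noise could be generated and played back while Screenpipe records. The following is a minimal sketch using only the Python standard library. The function name `write_test_wav`, the 16 kHz sample rate, and the noise amplitude are assumptions chosen for illustration, not values taken from Screenpipe.

```python
import random
import struct
import wave

SAMPLE_RATE = 16_000  # Hz; assumed for this sketch, not read from Screenpipe's config


def write_test_wav(path, silence_s=5.0, noise_s=5.0, noise_amplitude=500, seed=0):
    """Write a mono 16-bit WAV: digital silence followed by low-level white noise.

    Returns the number of samples written.
    """
    rng = random.Random(seed)  # fixed seed so the clip is reproducible
    silence = [0] * int(silence_s * SAMPLE_RATE)
    noise = [
        rng.randint(-noise_amplitude, noise_amplitude)
        for _ in range(int(noise_s * SAMPLE_RATE))
    ]
    samples = silence + noise
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return len(samples)


if __name__ == "__main__":
    write_test_wav("silence_then_noise.wav")
```

Playing the resulting file through the selected output device, or near the microphone, should exercise the same no-speech conditions described in the steps above. Whether it triggers the repetition reliably is untested here.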
Expected Behavior
- The transcription should accurately reflect spoken content without inserting repetitive phrases during silent or white noise periods.
- Summaries should be concise and relevant to the actual meeting content.
Actual Behavior
- The transcription fills with repeated phrases like "the world is a great place to be able to get the world..." during silences or white noise.
- This repetition disrupts the generation of meaningful summaries.
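The looping output described above has a recognizable signature: the same short word sequence recurs several times within one transcript segment. A minimal sketch of a detector for that signature follows. The helper names `max_ngram_repeats` and `looks_looped`, the 4-word window, and the threshold of 3 repeats are all hypothetical choices for illustration, not part of Screenpipe or Whisper.

```python
from collections import Counter


def max_ngram_repeats(text, n=4):
    """Return the highest count of any n-word sequence in `text` (0 if too short)."""
    words = text.lower().split()
    if len(words) < n:
        return 0
    # Count every overlapping window of n consecutive words.
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return max(grams.values())


def looks_looped(text, n=4, min_repeats=3):
    """Flag a segment when some n-word sequence appears at least `min_repeats` times."""
    return max_ngram_repeats(text, n) >= min_repeats


if __name__ == "__main__":
    sample = (
        "the world is a great place to be able to get the world "
        "to be able to get the world to be able to get the world"
    )
    print(looks_looped(sample))
    print(looks_looped("we agreed to ship the release on friday after the review"))
```

A check like this could be used to flag or drop suspect segments before they reach the summarizer. The thresholds would need tuning so that legitimately repeated phrases in real speech are not discarded.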
Suggested Fixes
- Investigate and adjust the audio processing algorithm, particularly for the whisper-large-turbo model, to handle silence and white noise more effectively.
- Implement filters to prevent repetitive text from affecting transcription outputs.
- Consider adding a silence detection feature that skips transcription during prolonged quiet periods.
- Explore options to fine-tune the AI model to better distinguish between speech and non-speech audio.
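The silence detection suggestion could be sketched as a simple energy gate that skips quiet chunks before they are sent to the transcription model. This is an illustration only: the function names, the one-second chunk size at 16 kHz, and the 0.01 threshold are assumptions, and nothing here reflects how Screenpipe currently handles audio. An energy gate also only addresses near-silence; white noise above the threshold would still pass, so a proper voice activity detector would likely be needed for that case.

```python
import math


def rms(samples):
    """Root-mean-square level of 16-bit PCM samples, normalized to roughly 0..1."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0


def speech_candidate_chunks(samples, chunk_size=16_000, threshold=0.01):
    """Yield (start_index, chunk) only for chunks whose RMS level exceeds `threshold`.

    Chunks at or below the threshold are treated as silence and skipped,
    so they would never be handed to the transcription model.
    """
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        if rms(chunk) > threshold:
            yield start, chunk


if __name__ == "__main__":
    # One second of silence, one second of a loud square wave, one second of silence.
    audio = [0] * 16_000 + [8000, -8000] * 8_000 + [0] * 16_000
    for start, chunk in speech_candidate_chunks(audio):
        print(start, len(chunk))
```

Gating before transcription avoids spending model time on empty audio and removes the input on which the repetition was observed, at the cost of possibly clipping very quiet speech if the threshold is set too high.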
Additional Notes
- The issue occurs despite using the whisper-large-turbo model, which is generally considered high-quality for transcription tasks.
- The problem may be exacerbated by the experimental nature of some features, such as the restart interval setting.
Attachments
- Screenshot of the affected transcript (attached)
- Screenshot of current Screenpipe settings (attached)
Please address this issue to improve transcription accuracy and summary reliability in meetings with varying audio conditions.