[bug] Repetitive Transcription During Silence or White Noise Periods

Open cyberpapiii opened this issue 4 months ago • 2 comments

Description

Running Version 0.5.6 (20241017.030846)

Apple M3 Max MacBook Pro (14-inch, Nov 2023), Memory: 128 GB

macOS Sequoia Version 15.1 Beta (24B5077a)

Screenpipe's transcription feature generates repetitive text during periods of silence or white noise. This reduces the accuracy of meeting summaries and may be related to how the app processes audio input when no distinct speech is detected.

The repetitive text appears as:

"The world is a great place to be able to get the world to be able to get the world..." or other disjointed phrases.

This redundancy prevents the app from providing meaningful summaries and indicates a potential bug in the transcription process or AI settings.

[Screenshot: CX 10-17-2024 @ 05 15 26PM]

Current Settings

Audio Transcription Model: whisper-large-turbo
OCR Model: apple native
Monitors: 1 monitor(s) selected
Audio Devices: 2 device(s) selected
- MacBook Pro Microphone (input) (default)
- Display 1 (output) (default)
Languages: english
Restart Interval: 0 minutes (experimental feature)

[Screenshot: CX 10-17-2024 @ 05 14 38PM]

Steps to Reproduce

1. Open the Screenpipe app with the current settings.
2. Start a recording that is expected to contain periods of silence or white noise.
3. Check the transcription output in the "meeting and conversation history" section.

Expected Behavior

The transcription should accurately reflect spoken content without inserting repetitive phrases during silent or white noise periods.
Summaries should be concise and relevant to the actual meeting content.

Actual Behavior

The transcription fills with repeated phrases like "the world is a great place to be able to get the world..." during silences or white noise.
This repetition disrupts the generation of meaningful summaries.
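
To make the looping output above easier to flag automatically, here is a minimal sketch of one possible check. It is not Screenpipe's code. It relies on the fact that looped text compresses far better than ordinary speech. The function names `compression_ratio` and `looks_repetitive` and the 2.4 cut-off are assumptions for illustration and would need tuning on real transcripts.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Return raw size divided by zlib-compressed size for the given text.

    Looping text compresses very well, so it yields a high ratio.
    """
    raw = text.encode("utf-8")
    if not raw:
        return 0.0
    return len(raw) / len(zlib.compress(raw))


def looks_repetitive(text: str, threshold: float = 2.4) -> bool:
    """Flag text whose compression ratio exceeds the threshold.

    The default threshold is an assumed value, not a Screenpipe setting.
    """
    return compression_ratio(text) > threshold


# A looped phrase like the one in this report versus an ordinary sentence.
looped = "the world is a great place to be able to get " * 20
normal = "We agreed to ship the release on Friday after the QA pass."

print(looks_repetitive(looped))  # True
print(looks_repetitive(normal))  # False
```

A flagged segment could be dropped or marked as unreliable before it reaches the summary step. Very short segments compress poorly in any case, so this check is most useful on longer chunks.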

Suggested Fixes

Investigate and adjust the audio processing algorithm, particularly for the whisper-large-turbo model, to handle silence and white noise more effectively.
Implement filters to prevent repetitive text from affecting transcription outputs.
Consider adding a silence detection feature that skips transcription during prolonged quiet periods.
Explore options to fine-tune the AI model to better distinguish between speech and non-speech audio.
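
The silence detection idea above could look roughly like the sketch below, which skips transcription when a chunk's energy falls under a threshold. This is an illustration under stated assumptions, not Screenpipe's implementation: `transcribe_if_audible`, the `transcribe` callback, and the `min_rms` value of 0.01 are all hypothetical, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math
from typing import Callable, Optional, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a chunk of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def transcribe_if_audible(
    samples: Sequence[float],
    transcribe: Callable[[Sequence[float]], str],
    min_rms: float = 0.01,  # assumed threshold; tune per microphone
) -> Optional[str]:
    """Call the transcriber only when the chunk is louder than min_rms."""
    if rms(samples) < min_rms:
        return None  # treat as silence and skip the model entirely
    return transcribe(samples)


# Example: one second of digital silence and one second of a 440 Hz tone.
sample_rate = 16000
silence = [0.0] * sample_rate
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]


def fake_transcribe(chunk: Sequence[float]) -> str:
    """Stand-in for a real model call."""
    return "<speech>"


print(transcribe_if_audible(silence, fake_transcribe))  # None
print(transcribe_if_audible(tone, fake_transcribe))     # <speech>
```

An energy gate like this only covers true silence. White noise carries energy and would pass the threshold, so that case would likely need a proper voice activity detector or a repetition filter on the output text.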

Additional Notes

The issue occurs despite using the whisper-large-turbo model, which is generally considered high-quality for transcription tasks.
Experimental features, such as the restart interval setting (currently 0 minutes), may contribute to the problem, though this has not been confirmed.

Attachments

Screenshot of the affected transcript (attached)
Screenshot of current Screenpipe settings (attached)

Please address this issue to improve transcription accuracy and summary reliability in meetings with varying audio conditions.

Oct 17 '24 21:10 cyberpapiii
