Soniox hallucinated transcript handling

Open mdmohsin7 opened this issue 8 months ago • 0 comments

This PR introduces hallucination detection for Soniox streaming. Each finalized segment is compared against recent history using fuzzy string matching (fuzz.token_set_ratio). A segment is suppressed once it has been highly similar (SIMILARITY_THRESHOLD) to either of the last 1 or 2 segments a set number of times (REPEAT_COUNT_THRESHOLD). A maximum segment duration timeout (MAX_SEGMENT_DURATION_S) ensures the check still runs during long, continuous loops.
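
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the threshold values, the HallucinationFilter class, and the must_finalize helper are all assumptions made for the example, and a small stdlib stand-in replaces fuzz.token_set_ratio so the snippet runs without extra dependencies.

```python
from collections import deque
from difflib import SequenceMatcher

# All three values are assumed for illustration; the PR's actual values are not shown.
SIMILARITY_THRESHOLD = 90       # similarity score on a 0-100 scale
REPEAT_COUNT_THRESHOLD = 3      # consecutive near-repeats before suppressing
MAX_SEGMENT_DURATION_S = 30.0   # force a check after this many seconds


def token_set_similarity(a: str, b: str) -> float:
    """Rough stdlib stand-in for fuzz.token_set_ratio (hypothetical helper).

    Compares the sorted, de-duplicated tokens of both strings, so word
    order and repeated words do not affect the score.
    """
    ta = " ".join(sorted(set(a.lower().split())))
    tb = " ".join(sorted(set(b.lower().split())))
    if not ta or not tb:
        return 0.0
    return 100.0 * SequenceMatcher(None, ta, tb).ratio()


class HallucinationFilter:
    """Tracks the last two finalized segments and counts near-repeats."""

    def __init__(self, similarity=token_set_similarity):
        # Pass fuzz.token_set_ratio here if a fuzzy-matching library is installed.
        self.similarity = similarity
        self.history = deque(maxlen=2)
        self.repeat_count = 0

    def should_suppress(self, text: str) -> bool:
        # A segment counts as a repeat if it matches either recent segment.
        is_repeat = any(
            self.similarity(text, prev) >= SIMILARITY_THRESHOLD
            for prev in self.history
        )
        self.repeat_count = self.repeat_count + 1 if is_repeat else 0
        self.history.append(text)
        return self.repeat_count >= REPEAT_COUNT_THRESHOLD


def must_finalize(segment_start_s: float, now_s: float) -> bool:
    """True once a segment has run long enough that the check should be forced."""
    return now_s - segment_start_s >= MAX_SEGMENT_DURATION_S
```

In this sketch, a looping phrase is let through until the repeat counter reaches the threshold, and any sufficiently different segment resets the counter. Whether the real implementation resets the counter this way is an assumption.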

We cannot rely on the confidence field returned by Soniox, because it is above 0.9 even for hallucinated transcripts.

[Screenshots: 2025-04-15 12:20 AM, 2025-04-14 8:59 PM, 2025-04-14 8:49 PM]

mdmohsin7, Apr 14 '25 18:04