Maysee comments

Results: 43 comments by Maysee

[EDGE CASE] Cartoon voice worse performance on v5 than older version

Is there a workaround for this issue? I'm experiencing the same degradation in Japanese whispering audio.

lack of module 'google.generativeai' when use cpu version of torch suites

The google.generativeai package should already have been removed. In version 1.6.1, line 13 of chatbot.py is https://github.com/zh-plus/openlrc/blob/b78547dfd6688bd1ca03f74f04b7626abf076fa2/openlrc/chatbot.py#L13 , which conflicts with your error message: > File "/home/user00/gitspace/video_tools/.venv/lib/python3.10/site-packages/openlrc/chatbot.py", line 13, in...

Use Silero VAD in Batched Mode

Would it be better to let users choose between pyannote VAD and Silero VAD as the VAD model? I get better VAD segments for Chinese and Japanese audio with pyannote than with Silero,...

loose coupling transcription and translation steps

Thank you for this detailed analysis. I do think the structure of openlrc.py is not good - the tight coupling between transcription and translation makes the code less maintainable and...

Inconsistent definition of `clip_timestamps` parameter between `WhisperModel` and `BatchedInferencePipeline`

I think unifying the interfaces would be better than documenting differences. We could change both to Optional[List[dict]] where each dict has start/end times. This would make it possible to use...
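To make the proposal concrete, here is a minimal sketch of a normalizer that accepts either a flat form (a comma-separated string or a flat list of alternating start/end seconds) or a list of dicts, and always returns a list of dicts with start/end times. The helper name `normalize_clip_timestamps` and the `ClipInput` alias are hypothetical and not part of Faster-Whisper. As a simplifying assumption, a flat input with an odd number of values is rejected instead of being treated as an open-ended final clip.

```python
from typing import Dict, List, Optional, Sequence, Union

# Hypothetical alias: the input shapes this sketch accepts.
ClipInput = Union[str, Sequence[float], Sequence[Dict[str, float]], None]


def normalize_clip_timestamps(clips: ClipInput) -> Optional[List[Dict[str, float]]]:
    """Return clips as a list of {"start": s, "end": e} dicts, in seconds."""
    if clips is None:
        return None
    if isinstance(clips, str):
        # "0,10,20,30" -> [0.0, 10.0, 20.0, 30.0]
        clips = [float(t) for t in clips.split(",") if t.strip()]
    items = list(clips)
    if items and isinstance(items[0], dict):
        # Already in dict form: only coerce the values to float.
        pairs = [(float(c["start"]), float(c["end"])) for c in items]
    else:
        # Flat form: values alternate start, end, start, end, ...
        if len(items) % 2 != 0:
            raise ValueError("flat clip timestamps need an even number of values")
        pairs = [(float(s), float(e)) for s, e in zip(items[0::2], items[1::2])]
    for start, end in pairs:
        if start < 0 or end <= start:
            raise ValueError(f"invalid clip: start={start}, end={end}")
    return [{"start": s, "end": e} for s, e in pairs]
```

With a helper like this at the entry point, both code paths could share one internal representation while still accepting the older flat form from callers.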

Improve error handling: No active speech found in audio

There is an existing PR for Faster-Whisper to implement early stopping for non-voice audio, which can be found at https://github.com/SYSTRAN/faster-whisper/pull/1014. Until it's merged, there seems to be no straightforward solution...

Improve error handling: No active speech found in audio

It should be fixed in v1.6.0, which uses the latest version of Faster-Whisper. Please reopen it if the issue persists.

Improve error handling: No active speech found in audio

Thanks! I've updated this dependency.

Cannot launch with openlrc gui: No such file or directory: 'streamlit'

The Streamlit GUI is no longer supported. I'm planning to release a new GUI next month.

IMPORTANT: 1.0.3 VAD v5 is much worse than 1.0.2 or 1.0.1 VAD v4 for some certain audio data. WHY?

You must carefully tune the parameters for Silero VAD v5 according to your audio. Related issues: #925, #934
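As a rough illustration of what tuning might look like, the sketch below builds a dictionary of VAD options and checks that the values are in a sensible range. The helper `build_vad_parameters` is hypothetical. The key names follow the VAD options Faster-Whisper accepts through `vad_parameters`, but the default values here are illustrative placeholders, not recommendations, and should be checked against the installed version.

```python
from typing import Dict, Union


def build_vad_parameters(
    threshold: float = 0.5,
    min_speech_duration_ms: int = 250,
    min_silence_duration_ms: int = 2000,
    speech_pad_ms: int = 400,
) -> Dict[str, Union[int, float]]:
    """Validate and collect VAD options (hypothetical helper; values are placeholders)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    durations = (min_speech_duration_ms, min_silence_duration_ms, speech_pad_ms)
    if any(ms < 0 for ms in durations):
        raise ValueError("durations must be non-negative")
    return {
        "threshold": threshold,
        "min_speech_duration_ms": min_speech_duration_ms,
        "min_silence_duration_ms": min_silence_duration_ms,
        "speech_pad_ms": speech_pad_ms,
    }


# Assumed usage with an already loaded Faster-Whisper model:
# segments, info = model.transcribe(
#     "audio.wav",
#     vad_filter=True,
#     vad_parameters=build_vad_parameters(threshold=0.35, min_silence_duration_ms=500),
# )
```

A lower threshold and a shorter minimum silence are the kind of changes one might try first for quiet or whispered speech, but the right values depend on the audio.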