Leo Huang
@ggerganov please help, I did exactly the same thing as @Dmitriuso did:

```sh
yt-dlp -xv --audio-format wav -o skillsfuture.wav "https://www.youtube.com/watch?v=girQacfWjMw&list=PLH2CR4s1lqyjFm4vQPKT0-hE8sh2T10I1"
ffmpeg -i skillsfuture.wav -acodec pcm_s16le -ar 16000 sf.wav
./main -m ../whisper-models/ggml-base.en.bin
```
...
Would it be possible to integrate the ECAPA-TDNN model from [SpeechBrain](https://github.com/speechbrain/speechbrain) into this project, as the following project has done? https://huggingface.co/spaces/vumichien/Whisper_speaker_diarization I tested it with this video, https://www.youtube.com/watch?v=girQacfWjMw&list=PLH2CR4s1lqyjFm4vQPKT0-hE8sh2T10I1, and it works pretty well. But it is...
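For context: that Space pairs Whisper transcription with per-segment ECAPA-TDNN speaker embeddings, then clusters the embeddings to assign speaker labels. Below is a rough, self-contained sketch of just the clustering step. The function names and the greedy centroid strategy are my own illustration, not SpeechBrain's API; real pipelines typically use agglomerative clustering over the full similarity matrix instead.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def assign_speakers(embeddings, threshold=0.75):
    """Greedy online clustering: each segment embedding joins the most
    similar existing speaker centroid (if above threshold), else it
    starts a new speaker. Returns one integer label per segment."""
    centroids = []  # one running-mean embedding per speaker
    counts = []     # how many segments each centroid averages
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for i, c in enumerate(centroids):
            sim = cosine_similarity(emb, c)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            centroids.append(list(emb))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            counts[best] += 1
            # update the running mean of the matched centroid
            centroids[best] = [(c * (counts[best] - 1) + e) / counts[best]
                               for c, e in zip(centroids[best], emb)]
            labels.append(best)
    return labels
```

With real ECAPA-TDNN embeddings (192-dim vectors from `speechbrain/spkrec-ecapa-voxceleb`), the threshold would need tuning; the toy vectors here just show the mechanics.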
Thanks @bjnortier for the quick reply. I previously used the code from commit 09e90680072d8ecdf02eaf21c393218385d2c616, and it works perfectly on the same iPhone device. Does this mean there is much more memory usage since...
"When you load a CoreML model it is optimised on the device" - is the optimized model saved to local storage, or is it kept only in memory? If the answer is the latter...
> Hello, the download failed because the network connection dropped while the audio data was being downloaded. How can I resume the download from the point where it was interrupted?...
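The usual approach is an HTTP Range request: check how many bytes of the partial file are already on disk and ask the server for the rest. This is a minimal sketch, assuming the server supports Range requests (most static file hosts do); the function names are mine, not from any project discussed here.

```python
import os
import urllib.request

def resume_offset(dest_path):
    """Byte offset to resume from: the size of the partial file, or 0."""
    return os.path.getsize(dest_path) if os.path.exists(dest_path) else 0

def resume_download(url, dest_path, chunk_size=64 * 1024):
    """Resume an interrupted download using an HTTP Range request."""
    offset = resume_offset(dest_path)
    req = urllib.request.Request(url)
    if offset:
        req.add_header("Range", f"bytes={offset}-")
    with urllib.request.urlopen(req) as resp, open(dest_path, "ab") as f:
        # 206 Partial Content means the server honoured the Range header;
        # a plain 200 means it ignored it and is resending the whole file,
        # so discard the partial data and start over.
        if offset and resp.status != 206:
            f.truncate(0)
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            f.write(chunk)
```

Note this only works when the remote file has not changed between attempts; a robust downloader would also validate `ETag` or `Last-Modified` before appending.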
I'm also interested in this topic. Any update?
- "... it might change in the future": Does this mean pyannote/embedding will be optimized, or that there will be a better model than speechbrain/spkrec-ecapa-voxceleb?
- "... you need to optimize this..."