Support Whisper-v3-large-turbo

Open MonolithFoundation opened this issue 1 year ago • 19 comments

Support Whisper-v3-large-turbo

MonolithFoundation avatar Oct 10 '24 08:10 MonolithFoundation

Please update to funasr-1.1.12:

https://github.com/modelscope/FunASR/tree/main/examples/industrial_data_pretraining/whisper

LauraGPT avatar Oct 11 '24 06:10 LauraGPT

Thanks for the quick response!

MonolithFoundation avatar Oct 11 '24 07:10 MonolithFoundation

s exceeded with url: /funasr/ (Caused by SSLError(SSLError(1, '[SSL] record layer failure (_ssl.c:1006)'))) - skipping
ERROR: Could not find a version that satisfies the requirement funasr==1.1.12 (from versions: 0.3.1, 0.4.1, 0.4.2, 0.4.3, 0.4.4, 0.4.6, 0.4.7, 0.4.8, 0.5.0, 0.5.1, 0.5.2, 0.5.3, 0.5.4, 0.5.5, 0.5.6, 0.5.8, 0.6.0, 0.6.1, 0.6.2, 0.6.3, 0.6.4, 0.6.5, 0.6.6, 0.6.7, 0.6.9, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.7.5, 0.7.6, 0.7.7, 0.7.8, 0.7.9, 0.8.0, 0.8.1, 0.8.2, 0.8.3, 0.8.4, 0.8.6, 0.8.7, 0.8.8, 1.0.0, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.0.9, 1.0.10, 1.0.11, 1.0.12, 1.0.14, 1.0.15, 1.0.16, 1.0.17, 1.0.18, 1.0.19, 1.0.20, 1.0.21, 1.0.22, 1.0.23, 1.0.24, 1.0.25, 1.0.26, 1.0.27, 1.0.28, 1.0.29, 1.0.30, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.8, 1.1.9, 1.1.11)
ERROR: No matching distribution found for funasr==1.1.12

MonolithFoundation avatar Oct 11 '24 07:10 MonolithFoundation

After installing from git:

  • Authentication token does not exist, failed to access model Whisper-large-v3-turbo which may not exist or may be private. Please login first.

MonolithFoundation avatar Oct 11 '24 07:10 MonolithFoundation

Please update funasr again and re-try it: https://github.com/modelscope/FunASR/commit/cd684580991661b9a088361bea2d7f00735178d3

LauraGPT avatar Oct 11 '24 08:10 LauraGPT

After installing from git:

  • Authentication token does not exist, failed to access model Whisper-large-v3-turbo which may not exist or may be private. Please login first.

modelscope login --token YOUR_MODELSCOPE_SDK_TOKEN

You can get the SDK token on the home page: https://modelscope.cn/my/myaccesstoken

slin000111 avatar Oct 11 '24 09:10 slin000111

Hi, how do I deal with this error?

  File "/tests/test_speakersep.py", line 97, in get_asr_spk
    res = self.model.generate(
          ^^^^^^^^^^^^^^^^^^^^
  File "/FunASR/funasr/auto/auto_model.py", line 303, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/FunASR/funasr/auto/auto_model.py", line 553, in inference_with_vad
    sv_output = postprocess(all_segments, None, labels, spk_embedding.cpu())
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "FunASR/funasr/models/campplus/utils.py", line 117, in postprocess
    assert len(segments) == len(labels)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
0%|

MonolithFoundation avatar Oct 11 '24 09:10 MonolithFoundation

Hello, could anyone help with this? Currently Whisper Turbo is not stable at all.

MonolithFoundation avatar Oct 12 '24 03:10 MonolithFoundation

Just use it as shown in the demos; any other usage is not supported for now: https://github.com/modelscope/FunASR/tree/main/examples/industrial_data_pretraining/whisper

LauraGPT avatar Oct 12 '24 03:10 LauraGPT

Could the mismatch between the number of labels and segments be caused by this setting? vad_kwargs={"max_single_segment_time": 30000}

MonolithFoundation avatar Oct 12 '24 03:10 MonolithFoundation
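
The question above can be illustrated with a minimal sketch. It assumes that `max_single_segment_time` caps the length of a single VAD segment in milliseconds, and it shows one general way a count mismatch can arise: if long segments are cut into chunks after labels were computed per original segment, the two lists no longer have the same length. This is not FunASR's implementation, and the thread does not confirm that this is the cause of the `AssertionError`; `split_long_segments` and the sample data are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start_ms, end_ms)


def split_long_segments(segments: List[Segment], max_ms: int) -> List[Segment]:
    """Cut every segment longer than max_ms into chunks of at most max_ms."""
    out: List[Segment] = []
    for start, end in segments:
        # Emit full-length chunks while the remainder is still too long.
        while end - start > max_ms:
            out.append((start, start + max_ms))
            start += max_ms
        out.append((start, end))
    return out


# Hypothetical VAD output: one short segment and one 70-second segment.
vad_segments = [(0, 12_000), (15_000, 85_000)]
chunks = split_long_segments(vad_segments, max_ms=30_000)

# Hypothetical labels computed per original segment, before splitting.
labels = ["spk0", "spk1"]

# The same kind of check as the failing assertion, done without raising.
counts_match = len(chunks) == len(labels)
print(len(chunks), len(labels), counts_match)
```

Comparing the two lengths before calling a post-processing step is a cheap way to see whether segment splitting is involved in a given failure.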

Why isn't speaker recognition supported for Whisper?

MonolithFoundation avatar Oct 12 '24 03:10 MonolithFoundation

Hi, how do I deal with this error?

  File "/tests/test_speakersep.py", line 97, in get_asr_spk
    res = self.model.generate(
          ^^^^^^^^^^^^^^^^^^^^
  File "/FunASR/funasr/auto/auto_model.py", line 303, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/FunASR/funasr/auto/auto_model.py", line 553, in inference_with_vad
    sv_output = postprocess(all_segments, None, labels, spk_embedding.cpu())
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "FunASR/funasr/models/campplus/utils.py", line 117, in postprocess
    assert len(segments) == len(labels)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
0%|

Same for me.

TurboMa avatar Oct 12 '24 06:10 TurboMa

Why isn't speaker recognition supported for Whisper?

Whisper models lack timestamps for speaker recognition.

LauraGPT avatar Oct 12 '24 06:10 LauraGPT

Why isn't speaker recognition supported for Whisper?

Whisper models lack timestamps for speaker recognition.

[Screenshot 2024-10-12 14 12 36]

According to its model card on huggingface.co, the latest turbo version can be made to predict timestamps.

TurboMa avatar Oct 12 '24 06:10 TurboMa

Why isn't speaker recognition supported for Whisper?

Whisper models lack timestamps for speaker recognition.

[Screenshot 2024-10-12 14 12 36] According to its model card on huggingface.co, the latest turbo version can be made to predict timestamps.

Whisper's timestamps are sentence-level, whereas speaker recognition needs word-level timestamps. If you are interested in that, you could implement it yourself.

LauraGPT avatar Oct 12 '24 06:10 LauraGPT
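
The distinction between sentence-level and word-level timestamps can be sketched as follows. The example assigns a speaker to a time span by picking the diarization turn with the largest overlap, first for one sentence-level segment and then for word-level spans covering the same audio. All data, names, and the overlap rule are assumptions made for illustration; this is not how FunASR or Whisper performs speaker attribution.

```python
from typing import List, Tuple

Span = Tuple[float, float]       # (start_sec, end_sec)
Turn = Tuple[float, float, str]  # (start_sec, end_sec, speaker)


def overlap(a: Span, b: Span) -> float:
    """Length in seconds of the intersection of two time spans."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def assign_speaker(span: Span, turns: List[Turn]) -> str:
    """Pick the speaker whose turn overlaps the span the most."""
    best = max(turns, key=lambda t: overlap(span, (t[0], t[1])))
    return best[2]


# Hypothetical diarization output: speaker A talks, then speaker B.
turns = [(0.0, 2.0, "A"), (2.0, 5.0, "B")]

# One sentence-level ASR segment that straddles the speaker change.
sentence = (0.5, 4.0)

# The same audio with hypothetical word-level timestamps.
words = [
    ("hello", (0.5, 1.0)),
    ("there", (1.0, 1.8)),
    ("hi", (2.2, 2.6)),
    ("again", (2.6, 4.0)),
]

# The whole sentence receives a single label, so speaker A's words are lost.
sentence_label = assign_speaker(sentence, turns)

# Each word receives its own label, so the speaker change is preserved.
word_labels = [(word, assign_speaker(span, turns)) for word, span in words]

print(sentence_label)
print(word_labels)
```

With only a sentence-level span, text from both speakers collapses into one label whenever a segment crosses a speaker change; word-level spans let the text be split at the change.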

Hi, what if we run a VAD model first?

MonolithFoundation avatar Oct 12 '24 06:10 MonolithFoundation

Why isn't speaker recognition supported for Whisper?

Whisper models lack timestamps for speaker recognition.

[Screenshot 2024-10-12 14 12 36] According to its model card on huggingface.co, the latest turbo version can be made to predict timestamps.

Whisper's timestamps are sentence-level, whereas speaker recognition needs word-level timestamps. If you are interested in that, you could implement it yourself.

Thanks, very impressive.

TurboMa avatar Oct 12 '24 06:10 TurboMa

Hi, I still don't understand: why must speaker recognition be word-level?

MonolithFoundation avatar Oct 12 '24 07:10 MonolithFoundation

Hi, how do I deal with this error?

  File "/tests/test_speakersep.py", line 97, in get_asr_spk
    res = self.model.generate(
          ^^^^^^^^^^^^^^^^^^^^
  File "/FunASR/funasr/auto/auto_model.py", line 303, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/FunASR/funasr/auto/auto_model.py", line 553, in inference_with_vad
    sv_output = postprocess(all_segments, None, labels, spk_embedding.cpu())
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "FunASR/funasr/models/campplus/utils.py", line 117, in postprocess
    assert len(segments) == len(labels)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
0%|

Same for me.

Has this problem been solved?

johnhula avatar Nov 25 '24 09:11 johnhula

No, I actually built my own pipeline.

Silero + Paraformer works well. FSMN is not very good.

MonolithFoundation avatar Nov 25 '24 11:11 MonolithFoundation