SenseVoice
Mismatch between the number of timestamps and the number of characters [returning timestamps in word-level mode]
Notice: To resolve issues more efficiently, please raise issues following the template and include the relevant details.
❓ Questions and Help
Before asking:
- search the issues.
- search the docs.
What is your question?
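For some audio files (for example, when the ASR result contains special symbols), SenseVoice returns a different number of characters than timestamps: there are somewhat fewer timestamps than characters. Is there a way to align the two outputs? I want to try out SenseVoice's speaker diarization, which first requires word-level timestamps. References: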
- https://github.com/modelscope/FunASR/pull/2413
- https://www.modelscope.cn/models/iic/speech_campplus_speaker-diarization_common
Code
from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess
import re

model_dir = "iic/SenseVoiceSmall"
model = AutoModel(
    model=model_dir,
    vad_model="fsmn-vad",
    vad_kwargs={"max_single_segment_time": 30000},
    device="cuda:0",
)

audio_files = [
    'https://www.modelscope.cn/models/iic/speech_campplus_speaker-diarization_common/resolve/master/examples/2speakers_example.wav'
]

res = model.generate(
    input=audio_files,  # long audio, over 1 minute
    cache={},
    language="auto",  # "zh", "en", "yue", "ja", "ko", "nospeech"
    use_itn=True,
    batch_size_s=60,
    # merge_vad=True,
    merge_vad=False,
    merge_length_s=15,
    output_timestamp=True
)

text = res[0]["text"]
timestamp = res[0]["timestamp"]
# For the example audio above, with the next line uncommented, the counts match;
# for some other audio files, however, the counts do not match!
text = rich_transcription_postprocess(text)
print(f'text: {text}')
print(f'timestamp: {timestamp}')
print(len(text), len(timestamp))
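To narrow down where the two counts diverge, a small diagnostic along the following lines might help. It is only a sketch under stated assumptions: `countable_chars` and `compare_counts` are hypothetical helpers, not part of FunASR, and the idea that only non-tag, non-whitespace, non-punctuation characters receive a timestamp is a guess to be checked against real output, not documented behavior. The sample text and timestamps are made up for illustration.

```python
import re
import unicodedata

# SenseVoice raw output carries tags such as <|zh|> or <|NEUTRAL|>.
TAG_PATTERN = re.compile(r"<\|[^|]*\|>")


def countable_chars(text):
    """Return the characters left after dropping tags, whitespace,
    punctuation (Unicode category P*) and symbols (category S*, e.g. emoji)."""
    stripped = TAG_PATTERN.sub("", text)
    return [
        ch
        for ch in stripped
        if not ch.isspace() and unicodedata.category(ch)[0] not in ("P", "S")
    ]


def compare_counts(text, timestamp):
    """Compare character and timestamp counts.

    ASSUMPTION: each timestamp entry corresponds to one countable character.
    This may hold for Chinese text but not for subword-tokenized languages.
    """
    chars = countable_chars(text)
    return {
        "raw_chars": len(text),
        "countable_chars": len(chars),
        "timestamps": len(timestamp),
        "difference": len(chars) - len(timestamp),
    }


# Made-up sample data, not real model output.
sample_text = "<|zh|><|NEUTRAL|>你好,世界!"
sample_timestamp = [[0.0, 0.2], [0.2, 0.4], [0.4, 0.6], [0.6, 0.8]]
report = compare_counts(sample_text, sample_timestamp)
print(report)
```

If `difference` is zero for the audio files where `len(text)` and `len(timestamp)` disagree, the gap probably comes from symbols and punctuation; a nonzero value would point to something else, such as tokens that span several characters.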
What have you tried?
What's your environment?
- OS (e.g., Linux): ubuntu20.04
- FunASR Version (e.g., 1.0.0): 1.2.6
- ModelScope Version (e.g., 1.11.0): 1.18.0
- PyTorch Version (e.g., 2.0.0): 2.6.0
- How you installed funasr (pip, source): pip
- Python version: 3.12
- GPU (e.g., V100M32): gtx4090t
- CUDA/cuDNN version (e.g., cuda11.7): 12.4
- Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1): not docker
- Any other relevant information:
I have the same problem, have you solved it?
I have the same problem, have you solved it?
I raised a PR that fixed this issue. https://github.com/modelscope/FunASR/commit/8b0fb74bded1f8a162e6c0e94c3522be6216ea03
I ran into the same problem. Have you solved it?