Number of timestamps does not match number of characters [word-level timestamp mode]

Open qiutzh opened this issue 7 months ago • 4 comments

Notice: To resolve issues more efficiently, please raise your issue following the template and include the relevant details.

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

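I found that for some audio files (for example, when the ASR result contains special symbols), SenseVoice returns a different number of characters than timestamps: there are somewhat fewer timestamps than characters. Is there a way to align the two outputs? I want to try out SenseVoice for speaker diarization, which first requires word-level timestamps. References: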

  1. https://github.com/modelscope/FunASR/pull/2413
  2. https://www.modelscope.cn/models/iic/speech_campplus_speaker-diarization_common
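For context on why the counts need to line up, here is a rough sketch of how word-level timestamps would typically be combined with diarization output: each token is given the speaker whose segment overlaps it the most. The function name `assign_speakers` and both input formats (token spans as `(start_ms, end_ms)`, speaker segments as `(start_ms, end_ms, label)`) are assumptions made for illustration and are not taken from FunASR or the CAM++ pipeline.

```python
from typing import List, Optional, Tuple

TokenSpan = Tuple[int, int]            # (start_ms, end_ms), assumed format
SpeakerSegment = Tuple[int, int, str]  # (start_ms, end_ms, label), assumed format


def assign_speakers(
    token_spans: List[TokenSpan],
    speaker_segments: List[SpeakerSegment],
) -> List[Optional[str]]:
    """Label each token with the speaker segment it overlaps the most.

    Hypothetical helper: tokens that overlap no segment get None.
    """
    labels: List[Optional[str]] = []
    for start, end in token_spans:
        best_label, best_overlap = None, 0
        for seg_start, seg_end, speaker in speaker_segments:
            # Length of the intersection; zero or negative means no overlap.
            overlap = min(end, seg_end) - max(start, seg_start)
            if overlap > best_overlap:
                best_label, best_overlap = speaker, overlap
        labels.append(best_label)
    return labels
```

This only works if there is exactly one time span per token, which is why a timestamp list shorter than the text is a problem for this use case.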

Code

from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess
import re

model_dir = "iic/SenseVoiceSmall"

model = AutoModel(
    model=model_dir,
    vad_model="fsmn-vad",
    vad_kwargs={"max_single_segment_time": 30000},
    device="cuda:0",
)

audio_files = [
    'https://www.modelscope.cn/models/iic/speech_campplus_speaker-diarization_common/resolve/master/examples/2speakers_example.wav'
]
res = model.generate(
    input=audio_files,  # long audio, over 1 minute
    cache={},
    language="auto",  # "zh", "en", "yue", "ja", "ko", "nospeech"
    use_itn=True,
    batch_size_s=60,
    # merge_vad=True,  #
    merge_vad=False,  #
    merge_length_s=15,
    output_timestamp=True
)
text = res[0]["text"]
timestamp = res[0]["timestamp"]
# With this line enabled, the counts match for the example audio above,
# but for some other audio files they still do not match!
text = rich_transcription_postprocess(text)

print(f'text: {text}')
print(f'timestamp: {timestamp}')
print(len(text), len(timestamp))
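Note that `len(text)` counts every character, including punctuation, spaces and any tags left in the string, so it can differ from `len(timestamp)` even when the alignment itself is fine. A minimal sketch of a stricter check follows. It assumes that each CJK character and each run of Latin letters or digits carries one timestamp, and that punctuation, whitespace and `<|...|>` tags carry none. That tokenization rule and the helper names are assumptions for illustration, not FunASR's actual behavior.

```python
import re

# Matches SenseVoice-style tags such as <|zh|> or <|NEUTRAL|>.
TAG_RE = re.compile(r"<\|[^|]*\|>")


def is_cjk(ch: str) -> bool:
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def count_timed_units(text: str) -> int:
    """Count units assumed to carry a timestamp (hypothetical rule).

    Each CJK character is one unit; each run of other letters/digits is
    one unit; punctuation, whitespace and tags are ignored.
    """
    text = TAG_RE.sub("", text)
    units = 0
    in_word = False
    for ch in text:
        if is_cjk(ch):
            units += 1
            in_word = False
        elif ch.isalnum():
            if not in_word:
                units += 1
            in_word = True
        else:
            in_word = False
    return units


def check_alignment(text: str, timestamps: list) -> tuple:
    """Return (unit_count, timestamp_count, counts_match)."""
    n_units = count_timed_units(text)
    return n_units, len(timestamps), n_units == len(timestamps)
```

With the script above this could be called as `check_alignment(text, timestamp)`. If the counts still differ under this rule, the mismatch is likely in the timestamp output itself rather than in how the text is counted.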

What have you tried?

What's your environment?

  • OS (e.g., Linux): ubuntu20.04
  • FunASR Version (e.g., 1.0.0): 1.2.6
  • ModelScope Version (e.g., 1.11.0): 1.18.0
  • PyTorch Version (e.g., 2.0.0): 2.6.0
  • How you installed funasr (pip, source): pip
  • Python version: 3.12
  • GPU (e.g., V100M32): gtx4090t
  • CUDA/cuDNN version (e.g., cuda11.7): 12.4
  • Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1): not docker
  • Any other relevant information:

qiutzh avatar Apr 24 '25 06:04 qiutzh

I have the same problem, have you solved it?

chengligen avatar May 25 '25 08:05 chengligen

I have the same problem, have you solved it?

PL2584718785 avatar May 26 '25 10:05 PL2584718785

I raised a PR that fixed this issue. https://github.com/modelscope/FunASR/commit/8b0fb74bded1f8a162e6c0e94c3522be6216ea03

chengligen avatar May 27 '25 01:05 chengligen

I have the same problem, have you solved it?

liuqijie6 avatar Jul 29 '25 05:07 liuqijie6