SenseVoice issues

Is it possible to keep punctuation without converting numbers to Arabic numerals?

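With use_itn=True, the output is: 如果是110加二等于12，2减三等于99. With use_itn=False, the output is: 如果是十十加二等于十二十二减三等于九九. The ITN results for the numbers 110, 12, and 2 are wrong. Is it possible to get output that combines the text-normalized result with retained punctuation?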

TinaChen95

question

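Notice: In order to resolve issues more efficiently, please raise issue following the template. ## ❓ Questions and Help First of all, thank you for open-sourcing SenseVoice: recognition accuracy is high and decoding is fast. I would like to make a feature request. Multi-speaker conversation is a very important business scenario, and I hope multi-speaker dialogue recognition can be supported, for example by using each speaker's voiceprint features to split the recognized content by speaker into a multi-party dialogue. Once recognition is complete, users could then label each speaker's identity. ### Before asking: 1. search the...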

hehuang139

question

How to restrict the language

4

Hi, the default mode often recognizes Chinese as Japanese. Is there any way to output only Chinese, or otherwise restrict the language?

MonolithFoundation

Does the model support outputting timestamps for the recognized text?

11

I want to use SenseVoice to generate subtitles, but following the example code, I do not see any timestamp information in the model output.

kirayomato

question

[Question] Explanation of each dataset field

1

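Notice: In order to resolve issues more efficiently, please raise issue following the template. ## ❓ Questions and Help I plan to annotate some data but do not fully understand the meaning of each field, so I am asking here. - #### Mostly understood: "key", "source", "target", "target_len", "text_language", "emo_target" - #### Not well understood: "source_len". Another issue explains it as "frames", each corresponding to "10ms", but this does not match some existing datasets, so my understanding of "effective audio information" may be off. Please explain. -...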
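One way to check the "frames at 10ms" reading of "source_len" against a given dataset is sketched below. This is only an illustration: the 10 ms frame shift is the interpretation quoted in the question (and the asker reports it does not always hold), and the helper names and the tolerance are hypothetical, not part of SenseVoice or FunASR.

```python
def frames_to_seconds(source_len: int, frame_shift_ms: float = 10.0) -> float:
    """Duration implied by source_len if every frame covers frame_shift_ms.

    Assumption: one frame per 10 ms, as suggested in the referenced issue.
    """
    return source_len * frame_shift_ms / 1000.0


def matches_duration(source_len: int, audio_seconds: float,
                     tolerance_s: float = 0.05) -> bool:
    """True if the implied duration is within tolerance_s of the real one."""
    return abs(frames_to_seconds(source_len) - audio_seconds) <= tolerance_s


# A record with source_len=1000 would imply a 10-second clip under this reading.
implied = frames_to_seconds(1000)
```

Records for which `matches_duration` returns `False` would be the ones where the 10 ms interpretation does not explain the stored value.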

LateLinux

question

The results are very good and I would like to use it. Would commercial use involve any infringement?

yangpeng-space

question

how to set skip_special_tokens and timestamp level?

8

How to set parameters similar to `skip_special_tokens` when generating ASR results? Additionally, does it support ASR results at the timestamp level?
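As a possible workaround, special tokens can be removed in post-processing. The sketch below assumes the raw text carries tags of the form `<|...|>` (for example `<|en|><|NEUTRAL|><|Speech|><|woitn|>`); `strip_special_tokens` is a hypothetical helper written for illustration, not a SenseVoice or FunASR option.

```python
import re

# Assumption: every special token is wrapped as <|...|> with no "|" inside.
SPECIAL_TOKEN = re.compile(r"<\|[^|]*\|>")


def strip_special_tokens(text: str) -> str:
    """Remove <|...|> tags and surrounding whitespace from a raw transcript."""
    return SPECIAL_TOKEN.sub("", text).strip()


raw = "<|en|><|NEUTRAL|><|Speech|><|woitn|>hello world"
clean = strip_special_tokens(raw)
```

The SenseVoice README also post-processes results with `rich_transcription_postprocess` from `funasr.utils.postprocess_utils`, which may be preferable when the emotion and event tags should be rendered rather than dropped.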

MrRace

Is there a way to return the word timestamp of a sentence?

3

Is there a way to return the word timestamp of a sentence? example: input sentence: "Hello readers,welcome!" output: [{ "word": "Hello", "start_time": 0.02, "end_time": 0.36, }, { "word": "readers", "start_time":...
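To make the requested structure concrete, the sketch below types one entry of that list and shows how such entries could be turned into a subtitle cue in SRT format. It assumes word-level timestamps in seconds are available in the shape shown in the question; `WordTimestamp`, `to_srt_time`, and `words_to_srt_cue` are hypothetical names, and nothing here is an existing SenseVoice output.

```python
from typing import List, TypedDict


class WordTimestamp(TypedDict):
    word: str
    start_time: float  # seconds from the start of the audio
    end_time: float    # seconds from the start of the audio


def to_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt_cue(index: int, words: List[WordTimestamp]) -> str:
    """Build one SRT cue spanning the first to the last word."""
    start = to_srt_time(words[0]["start_time"])
    end = to_srt_time(words[-1]["end_time"])
    text = " ".join(w["word"] for w in words)
    return f"{index}\n{start} --> {end}\n{text}\n"


# Uses only the first entry from the example in the question.
cue = words_to_srt_cue(1, [{"word": "Hello", "start_time": 0.02, "end_time": 0.36}])
```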

CuiRobert