gaochangfeng comments

Results: 12 comments of gaochangfeng

Can the confidence of the emotion recognition result be returned?

```ctc_logits = self.ctc.log_softmax(encoder_out)``` The second token holds the emotion probability distribution: https://github.com/FunAudioLLM/SenseVoice/blob/969634be261bf30dc3aea3ba317a45d3882f8c52/model.py#L855
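A minimal sketch of turning that log-softmax output into a confidence score. It assumes the row of log-probabilities at the emotion position has already been extracted as a plain Python list, and that the vocabulary ids of the emotion tokens are known. The function name `emotion_confidence` and the ids below are placeholders for illustration, not the real SenseVoice ids.

```python
import math

def emotion_confidence(log_probs_row, emotion_ids):
    """Renormalize log-probabilities over the emotion tokens only.

    log_probs_row: log-softmax values at the emotion position (one per vocab id).
    emotion_ids: dict mapping emotion name -> vocab id (placeholder ids here).
    """
    # Convert the emotion entries back to probabilities.
    scores = {name: math.exp(log_probs_row[idx]) for name, idx in emotion_ids.items()}
    total = sum(scores.values())
    # Renormalize so the emotion classes sum to 1.
    probs = {name: s / total for name, s in scores.items()}
    best = max(probs, key=probs.get)
    return best, probs[best], probs

# Toy example: a 5-entry "vocabulary"; ids 2..4 stand in for emotion tokens.
EMOTION_IDS = {"happy": 2, "sad": 3, "neutral": 4}  # hypothetical ids
row = [math.log(p) for p in (0.05, 0.05, 0.6, 0.1, 0.2)]
label, confidence, distribution = emotion_confidence(row, EMOTION_IDS)
```

Renormalizing over the emotion ids only is a design choice: it discards the probability mass that CTC assigns to non-emotion tokens at that position, so the scores read as a distribution over emotions.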

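[Question] Explanation of each field in the dataset

"event_target": the audio event; "with_or_wo_itn": whether the text is normalized (punctuation added, spelled-out numbers converted to digits)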

how to set skip_special_tokens and timestamp level?

> How to implement timestamp function? Would you give me some ideas

Use the forced_align provided by torchaudio, like:

```
alignment, scores = torchaudio.functional.forced_align(ctc_probs, preds.unsqueeze(0), None, None, blank=0)
```
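The frame-level alignment then has to be collapsed into per-token time spans. A rough sketch in plain Python, assuming the alignment has been converted to a list of token ids per frame (for example with `alignment[0].tolist()`) and that each frame covers a fixed duration. The helper name `merge_alignment` and the 0.06 s default frame duration are assumptions; check the frame rate of the actual front end.

```python
def merge_alignment(frame_tokens, blank=0, frame_sec=0.06):
    """Collapse a frame-level alignment into (token, start_sec, end_sec) spans.

    frame_tokens: one token id per frame, with `blank` marking CTC blanks.
    frame_sec: duration of one frame in seconds (assumed value, not verified).
    """
    spans = []
    start = None
    prev = blank
    # A trailing blank acts as a sentinel that closes the last open span.
    for t, tok in enumerate(list(frame_tokens) + [blank]):
        if tok != prev:
            if prev != blank:
                spans.append((prev, start * frame_sec, t * frame_sec))
            start = t
            prev = tok
    return spans

# Toy alignment: token 5 on frames 1-2, token 7 on frames 4-6.
spans = merge_alignment([0, 5, 5, 0, 7, 7, 7, 0], blank=0, frame_sec=0.5)
```

Consecutive frames with the same id are treated as one token, and a blank between two identical ids starts a new span, which follows the usual CTC convention.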

What emotion tokens are there in total?

happy, sad, angry, neutral, fearful, disgusted, surprised, unknown. The first four work better.

ban_emo_unk doesn't work

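ban_emo_unk forces an emotion to be output; it does not disable emotion. Emotion recognition adds no extra computation, so just remove the emotion tags manually.

The returned text contains some emoji. How can they be removed?

Just delete them with a regular expression or str.replace().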
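A small sketch of the regular-expression approach, assuming the markers are either `<|...|>` style tags or emoji in the common Unicode emoji blocks. The helper name `strip_markers` and both patterns are assumptions about what the output contains; adjust them to the symbols that actually appear.

```python
import re

# Tags such as <|zh|> or <|HAPPY|>; the exact tag set is an assumption.
TAG_RE = re.compile(r"<\|[^|>]*\|>")
# Common emoji blocks plus the variation selector; not an exhaustive emoji list.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")

def strip_markers(text):
    """Remove <|...|> tags and emoji, leaving the transcript text."""
    return EMOJI_RE.sub("", TAG_RE.sub("", text)).strip()

cleaned = strip_markers("<|zh|><|HAPPY|>你好😊")
```

The ranges deliberately exclude CJK characters, so Chinese transcript text is left untouched.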

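The returned text contains some emoji. How can they be removed?

> > Just delete them with a regular expression or str.replace().
>
> May I ask what these emoji are for?

They mark sound events and emotions.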

Can we generate the transcript including audio events?

the 2nd output token is the event token

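Does fine-tuning SenseVoiceSmall support adding event/emotion/language types?

SenseVoice reserves special tokens for extending its functionality. Use ```[tokenizer.ids2tokens(idx) for idx in range(tokenizer.get_vocab_size()) ]``` to list them. Unused tokens are named SPECIAL_TOKEN_X; tokens after SPECIAL_TOKEN_15 are the recommended ones to use for extensions.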
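A short sketch of picking out the reserved tokens from such a listing. It assumes the token strings contain `SPECIAL_TOKEN_` followed by a number; the helper name `free_special_tokens` and the stand-in vocabulary below are hypothetical, and the real strings may be formatted differently.

```python
import re

SPECIAL_RE = re.compile(r"SPECIAL_TOKEN_(\d+)")

def free_special_tokens(tokens, after=15):
    """Return (id, token) pairs for SPECIAL_TOKEN_X entries with X > after."""
    free = []
    for idx, tok in enumerate(tokens):
        m = SPECIAL_RE.search(tok)
        if m and int(m.group(1)) > after:
            free.append((idx, tok))
    return free

# Stand-in vocabulary; with the real tokenizer the list would come from the
# comprehension quoted above.
vocab = ["<|zh|>", "<|HAPPY|>", "<|SPECIAL_TOKEN_15|>",
         "<|SPECIAL_TOKEN_16|>", "<|SPECIAL_TOKEN_17|>"]
candidates = free_special_tokens(vocab)
```

Each returned pair gives the vocabulary id alongside the token name, which is the mapping needed when assigning a new event, emotion, or language label to a reserved slot.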