Alan Fang comments

Language model decoding error

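Hello @Slyne, I read the language model building section of your README and ran a comparison between ARPA files built with SRILM and with KenLM. Both tools can build the language model, provided the corpus is first split into single characters. However, I ran into a problem: when I used SRILM to interpolate two ARPA files, decoding with the merged language model often outputs only the first part of the utterance, which makes the CER very poor. Why would interpolation affect decoding?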
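The character-level split mentioned above can be sketched as follows. This is only an illustration: the function name `split_into_characters` is hypothetical and not taken from the README, and the sketch splits every non-whitespace character, so Latin words would also be broken into letters.

```python
def split_into_characters(line: str) -> str:
    """Return the line with every non-whitespace character separated by one space."""
    return " ".join(ch for ch in line if not ch.isspace())


if __name__ == "__main__":
    # Each Chinese character becomes its own token for n-gram counting.
    print(split_into_characters("语言模型 解码"))
```

The output of this step, one space-separated sentence per line, is the kind of text that n-gram counting tools normally consume.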

Language model decoding error

One more detail: after interpolation, the following warning was reported: line 7: warning: non-zero probability for <unk> in closed-vocabulary LM BOW numerator for context "" is -0.4 < 0
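One possible way to investigate the warning is to check how much probability mass the unigram section of the merged ARPA file carries. The sketch below assumes that a negative back-off numerator for the empty context means the explicit unigram probabilities sum to more than 1; that reading of the SRILM message is an assumption, not something confirmed in this thread. The helper name `unigram_mass` is hypothetical.

```python
from typing import Iterable


def unigram_mass(arpa_lines: Iterable[str]) -> float:
    """Sum 10**log10_prob over the \\1-grams: section of an ARPA file."""
    in_unigrams = False
    total = 0.0
    for raw in arpa_lines:
        line = raw.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams:
            if line.startswith("\\"):
                # Reached the next section header or \end\.
                break
            if not line:
                continue
            # ARPA entries start with the log10 probability.
            log10_prob = float(line.split()[0])
            total += 10.0 ** log10_prob
    return total
```

With a real file this could be called as `unigram_mass(open("merged.arpa", encoding="utf-8"))`, where `merged.arpa` is a placeholder path. A result well above 1.0 (for example about 1.4, which would match a numerator of -0.4) would suggest the merged file is not a properly normalized model.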

WARNING NaN or Inf found in input tensor.

I also encountered the same problem. I'm trying to add some noise wavs to \ label; however, when the wav is shorter than 0.5 seconds, the warning appears,...
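To check whether short clips are involved, the training list could be scanned for wav files below the 0.5 second mark mentioned above. This is a minimal sketch using the standard `wave` module; the helper names are hypothetical, and the thread does not say whether excluding such clips is the right fix.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM wav file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_too_short(path: str, min_seconds: float = 0.5) -> bool:
    """Flag clips shorter than min_seconds (0.5 s is the value from the comment)."""
    return wav_duration_seconds(path) < min_seconds
```

Flagged files can then be inspected or left out of one training run to see whether the NaN/Inf warning disappears.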

How to add new words during fine-tuning?

Freeze every module except the output layers (the CTC output and the attention decoder output), add the new words to your unit.txt, change the output size accordingly, and then fine-tune the model.
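The vocabulary step can be sketched as below. It assumes that each line of unit.txt holds a token and its integer id separated by whitespace, and that new tokens can simply be appended with the next free ids; both are assumptions about the file format, and some toolkits require a special token to keep the last id. The function name `extend_units` is hypothetical.

```python
from typing import Dict, List


def extend_units(unit_lines: List[str], new_tokens: List[str]) -> Dict[str, int]:
    """Parse 'token id' lines and append unseen tokens with fresh ids."""
    table: Dict[str, int] = {}
    for line in unit_lines:
        if not line.strip():
            continue
        token, idx = line.split()
        table[token] = int(idx)
    next_id = max(table.values(), default=-1) + 1
    for token in new_tokens:
        if token not in table:  # keep existing ids stable
            table[token] = next_id
            next_id += 1
    return table
```

The size of the returned table is the new output size for the CTC and attention decoder output layers; the rows for existing tokens keep their pretrained weights, and only the added rows start from fresh initialization.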

[transformer] Add moe_noisy_gate

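> I tested noisy-MoE in the decoder. As with large models, it seems to perform better when used in the decoder.
>
> I don't have enough GPU memory to run MoE in both the encoder and the decoder, so I'll have to ask others to verify that configuration.
>
> | | ctc_prefix_beam_search | att_rescoring |
> | --- | --- | --- |
> | U2++-baseline | 5.80% | 5.06% |
> | Normal Gate-Encoder | 5.60% | 5.23% |
> | Noisy Gate(decode)-Encoder | 5.62% | 5.27% |
> | Noisy Gate(only train)-Encoder | 5.62% | 5.27% |
>
> ...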
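For readers unfamiliar with the term, a noisy gate can be sketched in plain Python as follows. This follows the commonly described noisy top-k gating scheme (Gaussian noise scaled by a softplus term, then a softmax over the top k experts); whether the pull request implements exactly this is not stated here, and all names below are hypothetical. The `training` flag models the "only train" variant, where noise is switched off at decode time.

```python
import math
import random
from typing import List, Optional


def softplus(x: float) -> float:
    """Smooth positive function used to scale the noise."""
    return math.log1p(math.exp(x))


def noisy_top_k_gate(
    clean_logits: List[float],
    noise_logits: List[float],
    k: int,
    training: bool,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one gate weight per expert; only the top-k experts are non-zero."""
    rng = rng or random.Random()
    logits = list(clean_logits)
    if training:
        # Add per-expert Gaussian noise with a learned (here: given) scale.
        logits = [
            c + rng.gauss(0.0, 1.0) * softplus(n)
            for c, n in zip(clean_logits, noise_logits)
        ]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]
```

In a real model the two logit vectors would come from learned linear projections of the input frame, and the gate weights would mix the outputs of the selected expert feed-forward layers.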

Why does the loss swing up and down during fine-tuning?

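I have a question: if the parameters loaded as int8 are frozen, do they still affect training? According to the official site, the V100 does not actually support int8 tensor cores.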

Why does the loss swing up and down during fine-tuning?

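> @fclearner What you describe is the usual QLoRA plus fp16/fp32 optimizer setup. With QLoRA, the earlier code used the default fp16/fp32 mixed-precision optimization. In essence only the LoRA part is trained, and its gradients are fp16. The model itself is stored in GPU memory as int8, and that part is indeed frozen; it is only dequantized at compute time. The V100 does not support int8, which may make this dequantized computation poor in both speed and accuracy.

OK, thanks for the explanation. I'm somewhat confused about the difference between int8 tensor cores and int8. My earlier understanding was that int8 tensor cores are used to accelerate training, while int8 is used for ordinary inference computation. So I assumed that if the int8 parameters are frozen, they would not take part in the accelerated training path and nothing would go wrong. But from your description, the V100 does not support int8 in the computation either, is that right? In that case, what exactly does the V100 int8 support shown in this figure mean? Looking forward to your reply! ![image](https://github.com/user-attachments/assets/f2e43424-d0a6-47ce-948f-1e3dca1759ea)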
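The idea of weights that are stored as int8 but dequantized at compute time can be illustrated with a small sketch. It uses simple absmax scaling in plain Python; real quantization libraries use more elaborate schemes and GPU kernels, so the function names and the scaling rule here are assumptions for illustration only.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one absmax scale."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid a zero scale
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


def frozen_linear(x: List[float], quantized: List[int], scale: float) -> float:
    """Dot product that dequantizes the frozen weights just before use."""
    weights = dequantize(quantized, scale)
    return sum(xi * wi for xi, wi in zip(x, weights))
```

In this picture the stored integers never receive gradients, but every forward pass still goes through the dequantization step, so its cost and rounding error remain part of training even though the weights are frozen.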

Does fsmn-vad support endpoint detection?