wsd12345
> Your reference audio length should not exceed 90 seconds. Thanks for your response. I use 23 s of reference audio. In decode_n_tokens, I find that the stop token is never predicted. The dimensions...
> You 100% need multinomial sampling; argmax will cause repetition patterns. The generated audio waveform is indeed repetitive noise toward the end. But I don't know why it keeps repeating,...
> Greedy decoding with any LLM can run into the same issue. Thanks.
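
For anyone following the argmax vs. multinomial point, here is a minimal sketch of the difference. It is not the model's actual decoding code: the logits are made-up toy values, and `softmax`, `greedy_pick`, and `multinomial_pick` are hypothetical helper names used only for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Argmax decoding: always returns the highest-scoring token id."""
    return max(range(len(logits)), key=lambda i: logits[i])


def multinomial_pick(logits, temperature=1.0, rng=random):
    """Draw one token id at random from the softmax distribution."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy logits (assumed values): token 2 is only slightly ahead of the others.
logits = [1.0, 1.2, 1.5, 0.3]
rng = random.Random(0)  # fixed seed so the run is repeatable

greedy_ids = [greedy_pick(logits) for _ in range(200)]
sampled_ids = [multinomial_pick(logits, 1.0, rng) for _ in range(200)]

print("greedy picks:", set(greedy_ids))           # a single token id every time
print("sampled picks:", sorted(set(sampled_ids)))  # several different token ids
```

With the same logits, greedy decoding returns the same token every time, which is how a decoder can lock into a loop once the context starts repeating. Sampling still favors the top token but leaves room for others, including a stop token that is likely but not the single most likely choice.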
[Model and test code](https://github.com/wsd12345/sense_triton_question.git) The problem above was reproduced with test/1.py. The deployment server is a cloud instance with a Tesla V100-SXM2-16GB. I also found the following problems: when I run test/2.py, it turned out...