CosyVoice
Add new special_tokens
Hello,
I would like to train a model by adding new tokens and corresponding audio sounds that are not included in the additional_special_tokens of the QwenTokenizer.
For example, I want to add a token like [cry] along with the corresponding crying sound in the training dataset.
Could you advise how much audio data is typically required for the model to accurately generate the intended sound when inputting a custom token such as [cry]?
Would this require retraining or fine-tuning the pre-trained model extensively?
I've already tried training with approximately 200 audio samples but haven't observed any noticeable improvements or desired outcomes.
I would greatly appreciate any suggestions or recommendations you might have.
Thank you!
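To make the setup concrete, here is a minimal sketch of what adding a token such as [cry] involves, assuming the model keeps one embedding row per token id. It is a toy stand-in, not CosyVoice or QwenTokenizer code: `ToyTokenizer`, `resize_embeddings`, the tiny vocabulary, and the `[laughter]` entry are all hypothetical and chosen only for illustration.

```python
import random
import re


class ToyTokenizer:
    """Hypothetical whitespace tokenizer with special tokens (not the QwenTokenizer API)."""

    def __init__(self, vocab, special_tokens):
        self.special_tokens = list(special_tokens)
        all_tokens = list(vocab) + self.special_tokens
        self.token_to_id = {tok: i for i, tok in enumerate(all_tokens)}
        self.unk_id = self.token_to_id["<unk>"]

    def __len__(self):
        return len(self.token_to_id)

    def add_special_tokens(self, new_tokens):
        """Register unseen tokens at the end of the vocabulary; return how many were added."""
        added = 0
        for tok in new_tokens:
            if tok not in self.token_to_id:
                self.token_to_id[tok] = len(self.token_to_id)
                self.special_tokens.append(tok)
                added += 1
        return added

    def encode(self, text):
        """Keep special tokens whole; split everything else on whitespace."""
        ordered = sorted(self.special_tokens, key=len, reverse=True)
        pattern = "(" + "|".join(re.escape(t) for t in ordered) + ")"
        ids = []
        for piece in re.split(pattern, text):
            if piece in self.special_tokens:
                ids.append(self.token_to_id[piece])
            else:
                ids.extend(self.token_to_id.get(w, self.unk_id) for w in piece.split())
        return ids


def resize_embeddings(embeddings, new_size, dim, rng):
    """Append randomly initialized rows until every token id has an embedding."""
    while len(embeddings) < new_size:
        embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
    return embeddings


rng = random.Random(0)
dim = 4  # illustrative embedding width
tok = ToyTokenizer(["<unk>", "i", "am", "sad"], ["[laughter]"])
embeddings = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(len(tok))]

# Before registration, "[cry]" is an unknown word and maps to <unk>.
before = tok.encode("i am sad [cry]")

# After registration it gets its own id, and the embedding table needs a new row.
num_added = tok.add_special_tokens(["[cry]"])
resize_embeddings(embeddings, len(tok), dim, rng)
after = tok.encode("i am sad [cry]")

print(before, after, len(embeddings))
```

In this sketch the row for [cry] starts out random, so the token carries no meaning until training updates it. Whether CosyVoice handles new tokens the same way is an assumption here, not something confirmed in this thread.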
Please join our Dingding chat group; 陈谦 may be able to answer your question.
This issue is stale because it has been open for 30 days with no activity.
@0913ktg have you been able to add new tokens correctly?
@haziyevv Hi, I haven't been able to add new tokens.