
[Question] Add new emotion tags

Open haziyevv opened this issue 6 months ago • 1 comments

@aluminumbox Hello. I want to add new emotion tags such as "[surprise yelp]", "[heavy breathing]", and "[excited whoop]".

When I checked, I found that the tokenizer has 151663 tokens, but the LLM ("Qwen2ForCausalLM") has a vocabulary size of 151936.
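To make the mismatch concrete, here is a small arithmetic sketch using only the two numbers reported above. It assumes the vocabulary size equals the number of rows in the LLM's embedding table, which the post does not confirm.

```python
# Numbers taken from the report above.
tokenizer_len = 151663    # tokens known to the tokenizer
embedding_rows = 151936   # LLM vocabulary size (assumed to equal embedding rows)

# Rows in the embedding table that no tokenizer ID currently points to.
spare_rows = embedding_rows - tokenizer_len
print(spare_rows)  # 273
```

Under that assumption, up to 273 new tokens would fit in the existing table without resizing it.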

What I did:

  1. Added the special tokens, after which the tokenizer length became 151993.
  2. Resized the token embeddings (model.llm.model.resize_token_embeddings(151993)).
  3. Fine-tuned the model.
  4. Updated the CosyVoice2-0.5B model with the fine-tuned llm.pt and flow.pt.
  5. Ran inference. The emotion tags were not rendered; the output was just silence where they should have been.
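As a sanity check on steps 1 and 2, the sketch below works out which IDs the new tags would receive and whether the embedding table has to grow. It is plain Python, not CosyVoice or Transformers code. The helper `plan_new_tokens` is hypothetical, and it assumes new tokens are appended with sequential IDs starting at the current tokenizer length.

```python
def plan_new_tokens(base_len, new_tokens, embedding_rows):
    """Assign sequential IDs to new tokens and report the embedding size needed.

    Hypothetical helper: assumes IDs are appended starting at base_len.
    """
    unique = list(dict.fromkeys(new_tokens))  # drop duplicates, keep order
    token_ids = {tok: base_len + i for i, tok in enumerate(unique)}
    required_rows = base_len + len(unique)
    needs_resize = required_rows > embedding_rows
    return token_ids, required_rows, needs_resize


# The three tags named in the question.
tags = ["[surprise yelp]", "[heavy breathing]", "[excited whoop]"]
token_ids, required_rows, needs_resize = plan_new_tokens(151663, tags, 151936)
print(token_ids["[surprise yelp]"], required_rows, needs_resize)
# 151663 151666 False

# 151993 - 151663 = 330 added tokens; the names here are placeholders.
many_tags = [f"[tag {i}]" for i in range(330)]
_, required_many, needs_resize_many = plan_new_tokens(151663, many_tags, 151936)
print(required_many, needs_resize_many)
# 151993 True
```

With 330 added tokens the table does have to grow to 151993 rows, which matches the resize call in step 2. It may also be worth confirming that the tokenizer used at inference maps each tag to the same single ID that was used during fine-tuning, since a tag that is split into ordinary subword tokens would never reach the newly added embedding rows.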

haziyevv avatar May 10 '25 07:05 haziyevv

I tried fine-tuning with 1000 samples, but the model still does not learn the new tags.

haziyevv avatar May 14 '25 11:05 haziyevv

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Jun 14 '25 02:06 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Jul 02 '25 02:07 github-actions[bot]