metavoice-src

Does it support Arabic?

Open Qt4arab opened this issue 1 year ago • 3 comments

I have a 50k high-quality Arabic dataset. Is it possible to train the model on Arabic?

Qt4arab avatar Feb 07 '24 09:02 Qt4arab

See the comment in #6.

sidroopdaska avatar Feb 09 '24 01:02 sidroopdaska

I've added some initial pointers to this here: https://github.com/metavoiceio/metavoice-src/issues/70#issuecomment-1957337895

vatsalaggarwal avatar Feb 21 '24 17:02 vatsalaggarwal

Hey @Qt4arab, we've just published an initial approach for finetuning the last N transformer blocks of the first-stage LLM. It's best to experiment with the hyperparameters in finetune_params.py, as we haven't determined the optimal set. Let us know if you run into any issues, or if you're up for contributing improvements (via a parameter sweep or otherwise!)
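To make "finetuning the last N transformer blocks" concrete, here is a minimal, dependency-free sketch of how one might decide which parameters stay trainable. It is not the repository's implementation: the function name `select_trainable`, the example parameter names, and the `.layers.<index>.` naming pattern are all assumptions made for illustration, so check the real parameter names of the first-stage model before relying on it.

```python
import re


def select_trainable(param_names, last_n, block_pattern=r"\.layers\.(\d+)\."):
    """Return the names of parameters inside the last `last_n` transformer blocks.

    `block_pattern` is a hypothetical regex whose first group captures the
    block index; adjust it to match the actual model's parameter names.
    """
    if last_n <= 0:
        return set()

    pattern = re.compile(block_pattern)

    # Map each parameter to its block index (None for embeddings, final norm, etc.).
    block_of = {}
    for name in param_names:
        match = pattern.search(name)
        block_of[name] = int(match.group(1)) if match else None

    # Keep the `last_n` highest block indices that actually occur.
    indices = sorted({i for i in block_of.values() if i is not None})
    keep = set(indices[-last_n:])

    return {name for name, i in block_of.items() if i in keep}


# Hypothetical parameter names for a 3-block model.
example_names = [
    "transformer.wte.weight",
    "transformer.layers.0.attn.weight",
    "transformer.layers.1.attn.weight",
    "transformer.layers.2.mlp.weight",
    "transformer.ln_f.weight",
]

trainable = select_trainable(example_names, last_n=2)
print(sorted(trainable))

# With a PyTorch model, the same set could drive freezing, e.g.:
#   for name, param in model.named_parameters():
#       param.requires_grad = name in trainable
```

Everything outside the selected blocks (embeddings, earlier blocks, final norm) stays frozen in this sketch; whether that matches the published finetuning approach is something to confirm against the repository's code.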

The next step to improve finetuning effectiveness is to add LoRA adapters for the first-stage LLM, which is being worked on here.

lucapericlp avatar Mar 14 '24 13:03 lucapericlp