
support languages other than English

Open zolekode opened this issue 5 years ago • 13 comments

Hi @patil-suraj, your library is really amazing and I would like to contribute. Any tips on how to train it, say, for other languages?

zolekode avatar Jul 10 '20 09:07 zolekode

Thank you @zolekode .

For training in other languages we'll need a pre-trained model and dataset in that language. I'm not sure if there are pre-trained seq-to-seq models available for other languages right now.

Without a pre-trained model the quality of the questions won't be as good as it is now.

patil-suraj avatar Jul 10 '20 09:07 patil-suraj

@patil-suraj makes sense. Thanks a lot. I will look into that.

zolekode avatar Jul 10 '20 09:07 zolekode

@patil-suraj We do have KoGPT2 (https://github.com/SKT-AI/KoGPT2) and KorQuAD (https://korquad.github.io/). In this case, could you help generate QA pairs for Korean?

hunkim avatar Jul 11 '20 02:07 hunkim

Hi @hunkim, I do want to add support for GPT-2 in a week or two. Do you think you could upload this model to the HF model hub so that it can be easily integrated here?

As GPT-2 is not a seq-to-seq model, we'll need a different way to do QG with it. One way to do answer-aware QG with GPT-2 is to prepare the input like this: say our context is 42 is the answer to life, the universe and everything, the answer is 42, and the target question is What is the answer to life, the universe and everything?

Then the input text is: context: <hl> 42 <hl> is the answer to life, the universe and everything. question: What is the answer to life, the universe and everything?

Then prepare the attention mask such that there is no attention from the question: ... part, so the model won't look into future tokens, and calculate the loss only on the question: ... part. At inference time we feed only the context part and ask the model to generate the question.
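Here is a minimal sketch of that input/label preparation, assuming GPT-2 and its tokenizer from Hugging Face transformers; the variable names and the -100 label-masking convention are illustrative, not something already in this repo:

```python
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# <hl> is treated as plain text here; in practice it would likely be added as a special token.
context = "context: <hl> 42 <hl> is the answer to life, the universe and everything."
question = " question: What is the answer to life, the universe and everything?"

context_ids = tokenizer(context)["input_ids"]
question_ids = tokenizer(question)["input_ids"]

input_ids = context_ids + question_ids
attention_mask = [1] * len(input_ids)

# Compute loss only on the question tokens: label positions set to -100
# are ignored by the cross-entropy loss in transformers.
labels = [-100] * len(context_ids) + question_ids

# At inference time, feed only the context and let GPT-2 continue with the question.
```

With labels masked this way, the causal LM loss only penalizes the question tokens, which is the effect described above.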

Feel free to take a stab.

patil-suraj avatar Jul 12 '20 06:07 patil-suraj


Thanks for the library. Given a dataset (Wikipedia/news articles) in another language, how can I go about fine-tuning T5 on it? Thanks!

epetros avatar Jul 15 '20 14:07 epetros

Hi @patil-suraj, thanks a lot for sharing this code. It's awesome. As mT5 was released by Google a few weeks ago and Hugging Face is working on adding it to their library, do you plan to upgrade your repo with multilingual support? Cheers, Philippe

Neuronys avatar Nov 05 '20 07:11 Neuronys

@hunkim, @ghost, @zolekode, @epetros, @overnightJustifier

mT5 has just been added to transformers, so you can now use it to fine-tune on your own language if mT5 supports it. See https://discuss.huggingface.co/t/mt5-t5v1-1-fine-tuning-results/2098. I haven't tested this codebase against transformers master, so things might break.

It would be awesome if you guys could send a PR to add support for mT5 :)
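For anyone who wants to try this, here is a minimal, hypothetical sketch of loading an mT5 checkpoint with transformers; the generate question: / <hl> prompt shown mirrors the highlight format discussed earlier in this thread, and the model will only produce useful questions after being fine-tuned on a QG dataset in the target language:

```python
from transformers import MT5ForConditionalGeneration, T5Tokenizer

# google/mt5-small is the smallest public mT5 checkpoint on the HF hub.
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")

# Answer-aware input in the highlight format used elsewhere in this thread.
text = "generate question: <hl> 42 <hl> is the answer to life, the universe and everything."
inputs = tokenizer(text, return_tensors="pt")

# Before fine-tuning this will not give a meaningful question; after fine-tuning it should.
outputs = model.generate(**inputs, max_length=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```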

patil-suraj avatar Nov 17 '20 17:11 patil-suraj


Hi @patil-suraj, I managed to fine-tune mT5 on Chinese based on your script and it gives valid output, but I'm still wondering where the ceiling of the performance is. I suppose mT5 won't surpass T5; my loss is around 0.2. What is your best performance with the fine-tuned T5 model?

zhoudoufu avatar Apr 19 '21 03:04 zhoudoufu

@patil-suraj wonderful news!

@zhoudoufu Could you share the Chinese version? I will modify it for a Korean version. Thanks!

hunkim avatar Apr 19 '21 03:04 hunkim


@zhoudoufu Can you tell me how you prepared the fine-tuning data? Thanks

@patil-suraj - Is support for generating questions in Hindi available?

shaktisd avatar Aug 19 '21 06:08 shaktisd


I have tested mT5; the loss is 0.4 and it is still training. Do you mind if we discuss it in Chinese?

FayeYGY avatar Oct 09 '21 05:10 FayeYGY


Hi @zhoudoufu, were you able to completely fine-tune the mT5 model on Chinese? Can you please share your training pipeline so that I can try it out for other languages as well?

sabhi27 avatar Feb 17 '22 13:02 sabhi27