lm-question-generation
Can you provide a more detailed guide to fine-tuning operations?
I want to fine-tune an mT5 Chinese model on my own dataset! Extremely grateful.
Hey, do you have your dataset on HuggingFace?
Hello! Thanks for your reply! I have the dataset organized, but it is not uploaded to HuggingFace at the moment; I would like to load the dataset locally for mT5 fine-tuning.
Cool. Are you going to load the dataset locally via the HuggingFace datasets library? If it goes through the HuggingFace datasets library, it should be pretty straightforward, but otherwise it can be a bit tricky.
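For reference, the straightforward path with the HuggingFace datasets library looks roughly like this; the file paths, column names, and repository id below are illustrative assumptions, so match them to however your data is organized:

```python
from datasets import load_dataset

# Assumed JSON-lines files with one example per line; the column names
# ("paragraph", "questions_answers") should match what the trainer expects.
dataset = load_dataset(
    "json",
    data_files={
        "train": "data/train.jsonl",
        "validation": "data/validation.jsonl",
        "test": "data/test.jsonl",
    },
)

# Optionally push it to the HuggingFace Hub (after `huggingface-cli login`),
# so the dataset alias can be passed to the trainer later on.
dataset.push_to_hub("your-username/qag-zh-dataset")  # hypothetical repo id
```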
Very excited to see your feedback! I will upload the dataset to HuggingFace. How do I fine-tune the model after that?
@asahi417
Try the following command:
lmqg-train-search -c "tmp" -d "{your-hf-dataset-alias}" -m "mt5-small" -b 64 --epoch-partial 5 -e 15 --language "zh" --n-max-config 1 -g 2 4 --lr 1e-04 5e-04 1e-03 --label-smoothing 0 0.15
That will launch the fine-tuning with a hyperparameter grid search. You might need to play around with the parameters; see the full description via lmqg-train-search -h.
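The repository README also documents a GridSearcher class if you prefer to run the same search from Python. Here is a sketch mirroring the main flags of the command above; the argument names follow the README example, so double-check them against your installed lmqg version:

```python
from lmqg import GridSearcher

# Python equivalent of the `lmqg-train-search` command above.
trainer = GridSearcher(
    checkpoint_dir="tmp",
    dataset_path="{your-hf-dataset-alias}",  # same alias as the -d flag
    model="google/mt5-small",
    epoch=15,
    epoch_partial=5,
    batch=64,
    n_max_config=1,
    gradient_accumulation_steps=[2, 4],
    lr=[1e-04, 5e-04, 1e-03],
    label_smoothing=[0, 0.15],
)
trainer.run()
```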
For example, the following command was used to train https://huggingface.co/lmqg/mt5-base-zhquad-qag:
LA='zh'
MODEL="google/mt5-base"
MODEL_SHORT='mt5-base'
lmqg-train-search --use-auth-token -d "lmqg/qag_${LA}quad" -m "${MODEL}" -b 8 -g 8 16 -c "lmqg_output/${MODEL_SHORT}-${LA}quad-qag" -i 'paragraph' -o 'questions_answers' --n-max-config 2 --epoch-partial 5 -e 15 --max-length-output-eval 256 --max-length-output 256 --lr 1e-04 5e-04 1e-03
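Once training finishes, the resulting checkpoint can be loaded for end-to-end question and answer generation with the TransformersQG class from the same package. A sketch follows; the checkpoint path is a hypothetical stand-in for whatever the grid search writes out, and the published lmqg/mt5-base-zhquad-qag model works the same way:

```python
from lmqg import TransformersQG

# The path below is illustrative; point it at the checkpoint selected by the
# grid search, or at the published model, e.g. "lmqg/mt5-base-zhquad-qag".
model = TransformersQG(
    model="lmqg_output/mt5-base-zhquad-qag/best_model",
    language="zh",
)

# Generate question-answer pairs from a Chinese paragraph.
# ("The Meiji Restoration was a series of political and social reforms
#  carried out in Japan in the late 19th century.")
context = "明治维新是日本在19世纪后期推行的一系列政治与社会改革。"
qa_pairs = model.generate_qa(context)
print(qa_pairs)  # e.g. [(question, answer), ...]
```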
See more examples here:
https://github.com/asahi417/lm-question-generation/blob/master/misc/2023_acl_qag/model_finetuning.end2end.sh
Very excited to try this out! THANKS