
Can you provide a more detailed guide to fine-tuning operations?

Open chenzebiaohub opened this issue 9 months ago • 9 comments

I want to fine-tune an mT5 Chinese model on my own dataset! Extremely grateful.

chenzebiaohub avatar May 07 '24 17:05 chenzebiaohub

Hey, do you have your dataset on HuggingFace?

asahi417 avatar May 08 '24 05:05 asahi417

Hello! Thanks for your reply! I have the dataset organized, but not uploaded to HuggingFace at the moment; I'd like to load the dataset locally for mT5 fine-tuning.

chenzebiaohub avatar May 08 '24 05:05 chenzebiaohub

Cool. Are you going to load the dataset locally via HuggingFace datasets? If it uses the HuggingFace datasets format, it should be pretty straightforward, but otherwise it can be a bit tricky.
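A local dataset can be kept in a HuggingFace-loadable format even before uploading. Here is a minimal sketch of a JSON Lines file; the field names mirror the 'paragraph' / 'questions_answers' columns used elsewhere in this thread, and the example text is a placeholder, not real data:

```python
import json
import os
import tempfile

# Hypothetical records; 'paragraph' and 'questions_answers' mirror the
# -i / -o flags used for the lmqg QAG datasets. Text is placeholder only.
records = [
    {"paragraph": "北京是中国的首都。",
     "questions_answers": "question: 中国的首都是哪里? answer: 北京"},
    {"paragraph": "长江是亚洲最长的河流。",
     "questions_answers": "question: 亚洲最长的河流是什么? answer: 长江"},
]

# Write one JSON object per line (JSON Lines).
path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Such a file can then be loaded with the HuggingFace datasets library, e.g.
#   from datasets import load_dataset
#   dataset = load_dataset("json", data_files={"train": path})
loaded = [json.loads(line) for line in open(path, encoding="utf-8")]
```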

asahi417 avatar May 08 '24 05:05 asahi417

Very excited to see your reply! I will upload the dataset to HuggingFace. How should I fine-tune it after that?

chenzebiaohub avatar May 08 '24 05:05 chenzebiaohub

@asahi417

chenzebiaohub avatar May 08 '24 05:05 chenzebiaohub

Try the following command.

lmqg-train-search -c "tmp" -d "{your-hf-dataset-alias}" -m "mt5-small" -b 64 --epoch-partial 5 -e 15 --language "zh" --n-max-config 1 -g 2 4 --lr 1e-04 5e-04 1e-03 --label-smoothing 0 0.15

That will launch fine-tuning with a hyperparameter grid search. You might need to play around with the parameters. See the full description with lmqg-train-search -h.
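For a sense of the search space: the flags in that command define 2 gradient-accumulation values, 3 learning rates, and 2 label-smoothing values, i.e. 12 candidate configurations; roughly, each is trained for --epoch-partial epochs before the best (--n-max-config 1) is continued to the full -e epochs. A minimal sketch of that grid (the actual search is handled inside lmqg):

```python
from itertools import product

# Values copied from the command above.
grad_accum = [2, 4]                  # -g 2 4
learning_rates = [1e-4, 5e-4, 1e-3]  # --lr 1e-04 5e-04 1e-03
label_smoothing = [0.0, 0.15]        # --label-smoothing 0 0.15

# Cartesian product: every combination is one candidate configuration.
grid = list(product(grad_accum, learning_rates, label_smoothing))
print(len(grid))  # 12 candidate configurations
```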

asahi417 avatar May 08 '24 06:05 asahi417

For example, the following command was used to train https://huggingface.co/lmqg/mt5-base-zhquad-qag.

LA='zh'
MODEL="google/mt5-base"
MODEL_SHORT='mt5-base'
lmqg-train-search --use-auth-token -d "lmqg/qag_${LA}quad" -m "${MODEL}" -b 8 -g 8 16 -c "lmqg_output/${MODEL_SHORT}-${LA}quad-qag" -i 'paragraph' -o 'questions_answers' --n-max-config 2 --epoch-partial 5 -e 15 --max-length-output-eval 256 --max-length-output 256 --lr 1e-04 5e-04 1e-03
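As a sanity check, the shell variables in that snippet expand to the following dataset alias and output directory (reproduced here in Python purely for illustration):

```python
# Mirror the shell variables LA and MODEL_SHORT from the command above.
la = "zh"
model_short = "mt5-base"

# Expansions of "lmqg/qag_${LA}quad" and "lmqg_output/${MODEL_SHORT}-${LA}quad-qag".
dataset = f"lmqg/qag_{la}quad"
output_dir = f"lmqg_output/{model_short}-{la}quad-qag"
print(dataset)     # lmqg/qag_zhquad
print(output_dir)  # lmqg_output/mt5-base-zhquad-qag
```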

asahi417 avatar May 08 '24 06:05 asahi417

See more examples here:

https://github.com/asahi417/lm-question-generation/blob/master/misc/2023_acl_qag/model_finetuning.end2end.sh

asahi417 avatar May 08 '24 06:05 asahi417

Very excited to try this out! THANKS

chenzebiaohub avatar May 08 '24 07:05 chenzebiaohub