
Can we fine-tune on the Text-to-Code Retrieval task?

Open pdhung3012 opened this issue 1 year ago • 2 comments

Hello, I wonder whether CodeT5+ can be fine-tuned for Text-to-Code Retrieval, as UniXcoder does here. I ran zero-shot code retrieval for JavaScript, and the best accuracy I could get is 70.2%, which is lower than the 71.3% reported for the fine-tuned CodeT5+ in the CodeT5+ paper (here). So I want to check whether I can improve on the zero-shot result by fine-tuning.

Thank you

pdhung3012 avatar Aug 31 '23 03:08 pdhung3012

Yes, you can definitely finetune on labeled datasets using contrastive loss (or combined with the matching loss) to further boost the retrieval performance. We plan to release the finetuning scripts in the future if there are many asks for this.
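For readers who want a concrete picture of the contrastive loss mentioned above, here is a minimal sketch of a symmetric in-batch contrastive (InfoNCE-style) objective. It is not the CodeT5+ authors' script: the function names, the temperature of 0.05, and the use of in-batch negatives are assumptions for illustration, and the code is written in plain Python so it runs without a deep-learning framework. In practice the embeddings would come from the model's text and code encoders and the loss would be computed on tensors so gradients can flow.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax_at(scores, target):
    """Numerically stable log-softmax of `scores`, evaluated at index `target`."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[target] - log_sum


def in_batch_contrastive_loss(text_embs, code_embs, temperature=0.05):
    """Symmetric contrastive loss over a batch of paired (text, code) embeddings.

    Pair i is the positive; every other item in the batch serves as a negative.
    `temperature` is a hypothetical default, not a value taken from CodeT5+.
    """
    texts = [l2_normalize(t) for t in text_embs]
    codes = [l2_normalize(c) for c in code_embs]
    n = len(texts)

    # sim[i][j] = cosine similarity of text i and code j, scaled by temperature.
    sim = [
        [sum(a * b for a, b in zip(t, c)) / temperature for c in codes]
        for t in texts
    ]

    # Text-to-code direction: each row should peak on its diagonal entry.
    t2c = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # Code-to-text direction: each column should peak on its diagonal entry.
    c2t = -sum(
        log_softmax_at([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (t2c + c2t)


if __name__ == "__main__":
    texts = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", in_batch_contrastive_loss(texts, matched))
    print("swapped pairs:", in_batch_contrastive_loss(texts, swapped))
```

The loss is small when each text embedding is closest to its own code embedding and large when the pairs are mismatched, which is the signal fine-tuning would use to pull matching pairs together.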

yuewang-cuhk avatar Sep 05 '23 04:09 yuewang-cuhk

> Yes, you can definitely finetune on labeled datasets using contrastive loss (or combined with the matching loss) to further boost the retrieval performance. We plan to release the finetuning scripts in the future if there are many asks for this.

I would like to ask whether open-source fine-tuning scripts for Text-to-Code Retrieval with CodeT5+ are now available to share. Thanks a lot!

gzt4se avatar Jan 04 '24 03:01 gzt4se