WrViajero

Results 10 comments of WrViajero

You might find an answer by inspecting the output of your dataset, i.e. by looking at a concrete sample and its input_ids and labels, which is exactly what huggingface.trainer class...
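To make the suggestion concrete, here is a minimal sketch of inspecting one tokenized sample. It uses plain Python only; the helper `describe_sample`, the toy vocabulary, and the toy sample are hypothetical and not taken from the thread. The one assumption about real tooling is that positions whose label is -100 are skipped by the loss, which is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def describe_sample(sample, id_to_token=None):
    """Return (token, label, counted_in_loss) rows for one tokenized sample."""
    input_ids = sample["input_ids"]
    labels = sample["labels"]
    if len(input_ids) != len(labels):
        raise ValueError("input_ids and labels must have the same length")
    rows = []
    for tok_id, label in zip(input_ids, labels):
        # Fall back to the raw id when no vocabulary is supplied.
        token = id_to_token.get(tok_id, str(tok_id)) if id_to_token else str(tok_id)
        rows.append((token, label, label != IGNORE_INDEX))
    return rows


# Hypothetical toy data: the first two positions (the "prompt") are masked out.
toy_vocab = {1: "<s>", 5: "def", 6: "add", 7: "(", 2: "</s>"}
toy_sample = {"input_ids": [1, 5, 6, 7, 2], "labels": [-100, -100, 6, 7, 2]}

for token, label, counted in describe_sample(toy_sample, toy_vocab):
    print(f"{token!r:8} label={label:5d} counted_in_loss={counted}")
```

If every label in your real samples turns out to be -100, or the labels do not line up with the input_ids, that usually points to a preprocessing problem rather than a training problem.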

May I ask how you fine-tuned your model and how you evaluated its performance, i.e. with a fill-in-the-middle task or a next-code-sentence prediction task? As far as I know,...

Yes, I am encountering a similar issue as well. Did you try quantizing the model?

Actually, I have to spread my model across multiple GPUs; otherwise fine-tuning on 4 RTX 6000 48 GB GPUs leads to a CUDA OOM error. However,...

By the way, this is because our team mostly works with game development languages such as C#.

Thanks a lot for the update. However, I am not familiar with the code in Megatron-LM. May I ask which file(s) I should read to understand the fine-tuning logic?...

Hi, I am looking into the examples directory and will update this ticket once I have them running properly. Thanks.

Hello, my fine-tuning snippet is working now, so you may close this ticket. Thank you for the help!

I think that should work! Could you try modifying the input PDB to discard all chains except chain B?
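One way to do that without extra tooling is sketched below. It relies only on the fixed-column PDB layout, in which the chain identifier of coordinate records sits in column 22. The function name `keep_chain` and the toy records are hypothetical, and other records that may mention chains (for example SEQRES) are passed through untouched, so treat this as a starting point rather than a complete cleaner.

```python
# Record types that carry a chain identifier in column 22 (index 21).
COORD_RECORDS = {"ATOM", "HETATM", "ANISOU", "TER"}


def keep_chain(pdb_text, chain_id="B"):
    """Drop coordinate records that do not belong to chain_id."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in COORD_RECORDS:
            # Keep the record only if its chain column matches.
            if len(line) > 21 and line[21] == chain_id:
                kept.append(line)
        else:
            # Non-coordinate records (HEADER, END, ...) are kept as-is.
            kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical toy input with made-up coordinates, for illustration only.
toy_pdb = "\n".join([
    "HEADER    TOY EXAMPLE",
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  N   GLY B   1      12.560   7.210  -5.100  1.00  0.00           N",
    "TER       3      GLY B   1",
    "END",
])

print(keep_chain(toy_pdb, "B"))
```

For a real file, read it with `open(path).read()`, pass the text to `keep_chain`, and write the result to a new file so the original input stays intact.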