WrViajero comments

Results: 10 comments by WrViajero

Loss computation in finetune

You might find an answer by checking the output of the dataset, i.e. by looking at a concrete sample and its input_ids and labels, which is exactly what huggingface.trainer class...
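A minimal sketch of that kind of inspection, in plain Python. The sample below is made up for illustration (hypothetical token ids, not taken from any real tokenizer or from the StarCoder fine-tuning script), and the helper name `loss_positions` is an assumption. It relies only on the common PyTorch / transformers convention that a label of -100 is ignored by the cross-entropy loss.

```python
# Label value conventionally skipped by the loss (PyTorch's default ignore_index).
IGNORE_INDEX = -100


def loss_positions(sample):
    """Return the indices of tokens whose labels would count toward the loss."""
    input_ids = sample["input_ids"]
    labels = sample["labels"]
    if len(input_ids) != len(labels):
        raise ValueError("input_ids and labels must have the same length")
    return [i for i, label in enumerate(labels) if label != IGNORE_INDEX]


# Hypothetical sample: invented token ids, with the first three tokens masked out.
sample = {
    "input_ids": [101, 7, 42, 13, 99, 2],
    "labels": [-100, -100, -100, 13, 99, 2],
}

positions = loss_positions(sample)
print("tokens contributing to the loss:", positions)
```

If every label equals the corresponding input id, the loss is computed over the whole sequence; if some labels are -100, only the remaining positions contribute.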

Has anyone attempted to fine-tune the Starcoder model with your own code?

May I know how you fine-tuned your model and how you evaluate its performance, i.e. with a fill-in-the-middle task or a next-code-sentence prediction task? As far as I know,...

Starcoder generates some junk output

Yes, I am encountering a similar issue. Did you try quantizing the model?

finetune time

Actually, I have to spread my model across multiple GPUs; otherwise it leads to a CUDA OOM error when fine-tuning on 4 RTX 6000 48 GB GPU cards. However,...

Can/How StarCoder model can be used for encoding?

Any luck with this?

Fine-tuning on other programming languages than Python

By the way, I ask because our team works more often with game-development languages such as C#.

Fine-tuning on other programming languages than Python

Thanks a lot for the update. However, I am not familiar with the code in Megatron-LM. May I know which file(s) I should follow to understand the fine-tuning logic?...

Fine-tuning on other programming languages than Python

Hi, I am looking into the examples directory and will update this ticket once I have run it successfully. Thanks.

Fine-tuning on other programming languages than Python

Hello, my fine-tuning snippet is working now and you may close this ticket. Thank you for the help!

Possibility to foldseek against only a PDB-chain

I think that should work! Can you try modifying the input PDB to discard all chains except chain B?
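One possible way to do that filtering, sketched in plain Python under the assumption that the input follows the fixed-width PDB format, where the chain identifier sits in column 22 of coordinate records. The names `keep_chain` and `atom_line` and the sample records are hypothetical, and nothing here calls foldseek itself.

```python
# Record types that carry a chain identifier in column 22 (string index 21).
COORD_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def keep_chain(pdb_text, chain_id="B"):
    """Drop coordinate records that do not belong to chain_id; keep other lines."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(COORD_RECORDS):
            if len(line) > 21 and line[21] == chain_id:
                kept.append(line)
        else:
            kept.append(line)
    return "\n".join(kept) + "\n"


def atom_line(serial, chain):
    """Build a truncated, made-up ATOM record with the chain ID in column 22."""
    return f"ATOM  {serial:>5}  CA  ALA {chain}{serial:>4}"


# Hypothetical three-atom structure with chains A and B.
sample_pdb = "\n".join(
    [atom_line(1, "A"), atom_line(2, "B"), atom_line(3, "B"), "END"]
)

chain_b = keep_chain(sample_pdb, "B")
print(chain_b)
```

The filtered text can then be written to a new file and used as the query in place of the original structure.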