rallio

11 comments by rallio

I am new to using GitHub. I have attached the notebook that generates the questions, the correct answers, and the closed-book answers using T5. I am working on some ways to...

The model's `config.json` shows a notable difference between roberta-base and my newly pretrained RoBERTa model. `max_position_embeddings` in roberta-base is equal to 514, while in my new pretrained model it...
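
One way to surface this kind of mismatch is to diff the two `config.json` files key by key. The sketch below is illustrative only: `diff_configs` is a hypothetical helper, the configs are inline stand-ins rather than the real files, and the value 512 is a placeholder, since the comment above is cut off before the actual number.

```python
import json


def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose value differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Stand-in config fragments. 514 is the roberta-base value quoted in the comment;
# 512 is a hypothetical placeholder for the custom model (not taken from the source).
base_cfg = json.loads('{"max_position_embeddings": 514, "model_type": "roberta"}')
custom_cfg = json.loads('{"max_position_embeddings": 512, "model_type": "roberta"}')

differences = diff_configs(base_cfg, custom_cfg)
for key, (base_value, custom_value) in differences.items():
    print(f"{key}: base={base_value} custom={custom_value}")
```

In practice the two dictionaries would be loaded from each checkpoint's own `config.json` with `json.load`.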

Any updates on this? Would appreciate any help to identify the source of this bug.

Maybe there is some misunderstanding about what I posted. To the best of my knowledge, I am using an unmodified, default training script from Hugging Face on a plain text file...

The troubleshooting I did on this makes me think it has something to do with the special tokens being attention-masked during training dataset preparation. Normally masking special...

The roberta-base and roberta-large models on Hugging Face, when used with `model.generate`, do properly create the BOS/EOS tokens. The output from my checkpoints inserts an extra first and last token, but...

I think it should be fixed now. I ran pre-commit, added the .md file, and changed it to have the correct folder structure. Let me know if this works now. TwoDukes...

> You need a 24 gigabyte card and you need to use bfloat16. The RTX 3090 and other Ampere-level cards with 24 gigabytes of memory work, such as the A10, A100, A5000, etc.

Check the Discord or ping me if you want to do some testing with an early API and model version.