rallio comments

Results: 11 comments of rallio

Create synthetic QA dataset (~1k samples)

I'm new to using GitHub. I have attached the notebook that generates the questions, the correct answers, and the closed-book answers using T5. I am working on some ways to...

Create synthetic QA dataset (~1k samples)

Yes I can do that.

Using run_mlm.py to pretrain a roberta base model from scratch outputs do not include <bos> or <eos> tokens

There is a notable difference between the config.json of roberta-base and that of my newly pretrained RoBERTa model. max_position_embeddings in roberta-base is equal to 514, while in my new pretrained model it...

Using run_mlm.py to pretrain a roberta base model from scratch outputs do not include <bos> or <eos> tokens

Any updates on this? Would appreciate any help to identify the source of this bug.

Using run_mlm.py to pretrain a roberta base model from scratch outputs do not include <bos> or <eos> tokens

Maybe there is some misunderstanding in what I posted. To the best of my knowledge I am using an unmodified, default training script from huggingface on a plain text file...

Using run_mlm.py to pretrain a roberta base model from scratch outputs do not include <bos> or <eos> tokens

The troubleshooting I did myself on this makes me think it has something to do with the special tokens being attention masked in the training dataset preparation. Normally masking special...

Using run_mlm.py to pretrain a roberta base model from scratch outputs do not include <bos> or <eos> tokens

The roberta-base and roberta-large models on Hugging Face, when used with `model.generate`, do properly create the BOS/EOS tokens. The output from my checkpoints inserts an extra first and last token, but...
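
One way to check the symptom described here is to look at the first and last token ids of a generated sequence. The sketch below is only an illustration: the helper name `has_bos_eos` and the sample id lists are hypothetical, and the ids 0 and 2 are assumed to be the `<s>` and `</s>` ids of the roberta-base vocabulary, which a tokenizer trained from scratch may not share.

```python
from typing import Sequence

# Assumption: 0 and 2 are the ids of <s> and </s> in the roberta-base vocabulary.
# A tokenizer trained from scratch may assign different ids, so pass your own.
ROBERTA_BOS_ID = 0
ROBERTA_EOS_ID = 2


def has_bos_eos(
    token_ids: Sequence[int],
    bos_id: int = ROBERTA_BOS_ID,
    eos_id: int = ROBERTA_EOS_ID,
) -> bool:
    """Return True if the sequence starts with bos_id and ends with eos_id."""
    return (
        len(token_ids) >= 2
        and token_ids[0] == bos_id
        and token_ids[-1] == eos_id
    )


# Hypothetical id sequences, not taken from any real model output.
wrapped = [0, 100, 200, 2]    # begins with <s>, ends with </s>
unwrapped = [5, 100, 200, 7]  # extra first/last tokens that are not BOS/EOS

print(has_bos_eos(wrapped))
print(has_bos_eos(unwrapped))
```

In practice the id lists would come from the model output, and the ids to compare against from the tokenizer that was used for pretraining.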

A notebook for question and answer generation using one of the most powerful opensource NLU models, FLAN-T5-11B.

I think it should be fixed now. I ran pre-commit, added the .md file, and changed it to use the correct folder structure. Let me know if this works now. TwoDukes...

A notebook for question and answer generation using one of the most powerful opensource NLU models, FLAN-T5-11B.

> You need a 24 GB card and you need to use bfloat16. The RTX 3090 and other Ampere-class cards with 24 GB of memory, such as the A10, A100, and A5000, work.
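
The 24 GB figure is consistent with a back-of-the-envelope estimate of the weight memory. The sketch below is a rough calculation, not a measurement: it assumes "11B" means exactly 11e9 parameters, counts only the weights (no activations or framework overhead), and uses a hypothetical helper name `weight_memory_gb`.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

# Assumption: "11B" is taken as exactly 11e9 parameters for illustration.
FLAN_T5_11B_PARAMS = 11_000_000_000


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("float32", "bfloat16"):
    gb = weight_memory_gb(FLAN_T5_11B_PARAMS, dtype)
    print(f"{dtype}: ~{gb:.0f} GB for weights")
```

Under these assumptions the bfloat16 weights take about 22 GB, which fits on a 24 GB card with little headroom, while float32 weights would need about 44 GB.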

Setup web API Path that runs a prompt against a live model and returns the results

Check the discord or ping me if you want to do some testing with an early api and model version.