Serge Gotsuliak

Results 40 comments of Serge Gotsuliak
trafficstars

I've adopted this advice: > We found a larger rank (e.g. 256) and higher learning rate (e.g. 2e-4) worked best. See here: https://github.com/allenai/open-instruct

@tsingcoo Did you try AdaLoRA with Factory by yourself? How great is it compared to normal LoRA?

So for now cutrom prompting for so popular ShareGPT format datasets is not working at all?

I'm using Unsloth not initializing any collators explicitly within my code. So basically trainer just process raw texts without any prompts and masks: ``` trainer = SFTTrainer( dataset_text_field = "text",...

I'm evaluating different open datasets, converting all of them to ShareGPT format for easy of use. Like this one, where `conversations` column contains `system` / `human` / `gpt` parts of...

There special token for masking inputs: `IGNORE_TOKEN_ID = -100`

@danielhanchen sorry, nothing particularly interesting there. Packing = False and there no any juggling with attention / inputs matrixes Been there, done that :) There will be trashy model as...