datadoktergroen
How does this handle the mixed-precision problem when merging? Generally the base model is in 4-bit and the LoRA adapter in 16-bit. If you merge these, your LoRA...
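A toy sketch of the precision concern raised above (this is not the real PEFT API; the matrix sizes and scaling are made up). The usual workaround is to reload the base model in 16-bit before merging the adapter; the NumPy example below just illustrates the merge arithmetic itself, upcasting to float32 for the accumulation and casting back afterwards:

```python
import numpy as np

# Toy illustration of a LoRA merge: W' = W + (alpha / r) * (B @ A).
# In practice, a 4-bit base should be dequantized (e.g. reloaded in fp16)
# before merging, otherwise the adapter's 16-bit delta is rounded into
# the coarse 4-bit grid and precision is lost.

rng = np.random.default_rng(0)

d, r = 8, 2      # hidden size and LoRA rank (made-up numbers)
alpha = 4        # LoRA scaling numerator

W = rng.standard_normal((d, d)).astype(np.float16)  # base weight (16-bit here)
A = rng.standard_normal((r, d)).astype(np.float16)  # LoRA down-projection
B = np.zeros((d, r), dtype=np.float16)              # LoRA up-projection
B[:, 0] = 0.01                                      # small non-zero update

# Accumulate in float32 to limit rounding error, then cast back to fp16.
delta = (alpha / r) * (B.astype(np.float32) @ A.astype(np.float32))
W_merged = (W.astype(np.float32) + delta).astype(np.float16)
```

With a real 4-bit base, the analogous cast-back step would quantize `W_merged` onto a 16-value grid per block, which is where the adapter's fine-grained update gets destroyed.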
I ran into the same issue with Llama-2. I was wondering: what if you use the BOS token and add the padding on the left?
Somehow, manually appending an EOS token to the samples fixed it for me, despite already having add_eos=True in the AutoTokenizer. I'm using DataCollatorForCompletionOnlyLM as the collator,...
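A minimal sketch of the manual fix described above, assuming a Hugging Face-style tokenizer whose EOS token is available as a string (the function name and sample text are made up; `"</s>"` is the EOS token Llama-2 uses):

```python
# Hypothetical example: explicitly append the EOS string to each raw sample,
# rather than relying on the tokenizer to add it during encoding.

def format_samples(samples, eos_token):
    """Append the EOS string to every raw text sample."""
    return [text + eos_token for text in samples]

samples = ["### Instruction: say hi\n### Response: hi"]
formatted = format_samples(samples, eos_token="</s>")
# Each formatted sample now ends with the EOS token, so the model
# sees it during training and learns when to stop generating.
```

A formatting function like this can be passed to a trainer's preprocessing step, so the EOS marker is baked into the text before tokenization ever runs.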