Logophoman
I tried out [this stackoverflow post](https://stackoverflow.com/questions/70544129/transformers-asking-to-pad-but-the-tokenizer-does-not-have-a-padding-token); it didn't help me, but maybe it helps you?
This fixed it for my code:

```python
# tokenize the dataset in batches
dataset = dataset.map(lambda samples: tokenizer(samples["text"]), batched=True)

# give the tokenizer a padding token and resize the model's embeddings to match
tokenizer.pad_token = tokenizer.eos_token
tokenizer.add_special_tokens({'pad_token': '[PAD]'})
model.resize_token_embeddings(len(tokenizer))

trainer = transformers.Trainer(...)
```

The key for my issue was to...
Same issue here 😕 - please update this 👍
@Urammar could you also post how much VRAM the other 2 models need? I feel like this could help a lot of people know what their machine can actually...
Why would anyone use such a thing as a teleprompter? Most of us have seen teleprompters in action. The president or a newsreader stands in front of a screen that...
@allaccs I think the main reason LLaMA behaves so unexpectedly is that no Reinforcement Learning from Human Feedback (RLHF) has been done on it so far. This...
@ruian0 Could `python -m torch.distributed.run --nproc_per_node 2 example.py --ckpt_dir "/path/to/13B" --tokenizer_path "/path/to/tokenizer.model"` do the trick? I use an older torch version where torchrun is not available and it works for...
I have tried changing the different values, and `temperature`, as expected, helps the extraction the most; something like 0.1-0.2 works best with the LLaMA-30B model. The LLaMA model...
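In case it helps, here is a minimal sketch of how I set a low temperature, assuming the Hugging Face `transformers` generate API; the model path and prompt are just placeholders, not the exact setup from this thread:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path; point this at whichever LLaMA weights you actually have locally.
model_name = "path/to/llama-30b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Extract the named entities from the following text: ..."
inputs = tokenizer(prompt, return_tensors="pt")

# A low temperature (0.1-0.2) keeps sampling close to greedy decoding,
# which is what made the extraction output more consistent for me.
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.2,
    max_new_tokens=128,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```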
@stefangrotz according to [this Wikipedia article](https://en.wikipedia.org/wiki/Alemannic_Wikipedia#Language) the language code `als` was used when there was not yet an established distinction between the Alemannic dialects. I actually think, while a mixed approach...