Arthur
No worries, ping me whenever for another review!
IMO that is exactly the purpose of this pipeline. The functions should not necessarily have been part of the tokenizer, as they are only needed for the FIM task. So...
You should be able to use 4-bit quantization!
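Something along these lines should work (a rough sketch, assuming a recent `transformers` + `bitsandbytes` install; the model id is just a placeholder, swap in the checkpoint you are actually using):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization config (NF4 + bf16 compute is a common default)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "bigcode/starcoder2-3b"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # shard across available devices
)
```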
Sounds good. No need for the generation config update. Tokens are strings, so they should be saved in the tokenizer_config.json IMO
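For example (a minimal sketch; the checkpoint name and token strings are just placeholders for the FIM tokens of the model in question):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoderbase-1b")  # placeholder checkpoint

# FIM tokens are plain strings, so they belong to the tokenizer,
# not to the generation config
fim_tokens = ["<fim_prefix>", "<fim_middle>", "<fim_suffix>"]
tokenizer.add_special_tokens({"additional_special_tokens": fim_tokens})

# after saving, the added tokens are serialized in tokenizer_config.json
tokenizer.save_pretrained("./my-tokenizer")
```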
Yes, you can probably open a PR to the models and use the `revision` argument! WDYT?
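Something like this (the repo id and PR number are placeholders):

```python
from transformers import AutoTokenizer

# load from a specific Hub revision, e.g. an open PR branch on the model repo
tokenizer = AutoTokenizer.from_pretrained(
    "bigcode/starcoderbase",  # placeholder repo id
    revision="refs/pr/1",     # placeholder PR ref; a branch name or commit sha also works
)
```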
I think beam search with ROPE and fp16 has instabilities, yes, reported here: #26332. If I am not mistaken, this is what we have, no? And I think a recent...
I think computing ROPE in float32 precision should partly fix this
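Roughly the idea (a simplified sketch of the rotary tables, not the exact `transformers` implementation): build the cos/sin tables in float32 and only cast back to the model dtype at the end.

```python
import torch

def rope_cos_sin(seq_len, dim, base=10000.0, dtype=torch.float16, device="cpu"):
    # compute the rotary tables in float32 to avoid fp16 rounding/overflow issues
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, device=device).float() / dim))
    t = torch.arange(seq_len, device=device, dtype=torch.float32)
    freqs = torch.outer(t, inv_freq)         # (seq_len, dim // 2), float32
    emb = torch.cat((freqs, freqs), dim=-1)  # (seq_len, dim)
    # cast back to the model dtype only at the very end
    return emb.cos().to(dtype), emb.sin().to(dtype)
```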
I'll mark this as closed, because llama now computes rope in float32! 🥳 Feel free to ping me if you feel like this should not be closed
Hey feel free to ping me when this is ready! 🤗
Ok! Thanks, I'll review now, but will let @amyeroberts handle the rest as I'll be off for a week 😉