Stas Bekman

Results: 664 comments by Stas Bekman

> Also worth mentioning that whether a BOS token is added automatically depends on the particular model (tokenizer). But it's the case for e.g. llama and mistral tokenizers. Once the...

Ah, thank you so much, Nick - for some reason I was drawing a blank on finding how to use it. Now I was able to tell vllm not to add BOS...

Should the [`LLM`](https://docs.vllm.ai/en/stable/dev/offline_inference/llm.html) API be consistent with the client API and offer `add_special_tokens=False`? I guess it'd belong to `SamplingParams` and not `LLM`. `SamplingParams` already has `skip_special_tokens` - but that refers to...
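
For reference, a minimal sketch of the workaround (assuming a vLLM version that exposes `TokensPrompt`; the model name is just an example): pre-tokenize with `add_special_tokens=False` so no BOS gets prepended - `SamplingParams.skip_special_tokens` only affects detokenization of the output, not the input.

```python
# Sketch: avoid the automatic BOS by passing pre-tokenized input to vLLM.
# Assumes a vLLM version whose LLM.generate() accepts TokensPrompt.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from vllm.inputs import TokensPrompt

model_id = "mistralai/Mistral-7B-v0.1"  # example model, not from the thread
tok = AutoTokenizer.from_pretrained(model_id)

# add_special_tokens=False stops the tokenizer from prepending BOS
ids = tok("Hello, world", add_special_tokens=False)["input_ids"]

llm = LLM(model=model_id)
# skip_special_tokens only controls decoding of the generated text
params = SamplingParams(max_tokens=16, skip_special_tokens=True)
out = llm.generate(TokensPrompt(prompt_token_ids=ids), params)
print(out[0].outputs[0].text)
```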

I have a question about `lower_precision_safe_modules`: https://pytorch.org/docs/stable/amp.html#torch.autocast doesn't have an option to specify `lower_precision_safe_modules` - so why do we then put the onus on the DeepSpeed user? (and I'm aware that...
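
For context, a minimal sketch of plain `torch.autocast` usage - it is scoped only by device type, dtype and an enabled flag, with no per-module allow-list analogous to `lower_precision_safe_modules`:

```python
import torch

model = torch.nn.Linear(8, 8).cuda()
x = torch.randn(4, 8, device="cuda")

# torch.autocast takes device_type/dtype/enabled; there is no argument for
# choosing which module classes may run in lower precision.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    y = model(x)  # matmul-backed ops run in bf16 per autocast's internal op lists

print(y.dtype)  # torch.bfloat16
```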

> One confusing aspect is that it's valid to use PyTorch’s torch.autocast while disabling "torch_autocast" in the DeepSpeed config. In this case, DeepSpeed treats it as a pure FP32 setting....
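
To make the quoted scenario concrete, a hedged config sketch - only `torch_autocast` and `lower_precision_safe_modules` appear in this thread; the other keys and defaults are assumptions about the proposed feature:

```python
# Hypothetical DeepSpeed config fragment; exact keys may differ in the final PR.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "torch_autocast": {
        "enabled": False,  # disabled: DeepSpeed treats the run as pure FP32
        # "lower_precision_safe_modules": ["torch.nn.Linear"],
    },
}

# The confusing part: user code can still wrap the forward pass in
# torch.autocast even though the DeepSpeed-side integration is disabled:
#
#   with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
#       loss = model_engine(batch)
```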

I'm talking about peak memory usage. If you run the torch memory profiler around the reduction call, you will see that it'll use `reduce_bucket_size * 4 bytes * 2 copies` -...
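
A minimal sketch of measuring that peak with PyTorch's CUDA memory stats (the `all_reduce` here is just a stand-in for whatever reduction call the thread refers to, and `reduce_bucket_size` is the config value in elements):

```python
import torch
import torch.distributed as dist

def peak_mem_around(fn, *args, **kwargs):
    """Return peak CUDA memory (bytes) above the pre-call allocation."""
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    before = torch.cuda.memory_allocated()
    fn(*args, **kwargs)
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() - before

# Usage (inside an initialized process group):
# bucket = torch.empty(reduce_bucket_size, dtype=torch.float32, device="cuda")
# extra = peak_mem_around(dist.all_reduce, bucket)
# print(f"transient peak above steady state: {extra / 2**20:.1f} MiB")
```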

Then an assert is the way to go, Masahiro.

Hmm, I can't just hit approve; that would defeat the purpose of doing the review. We have only discussed one small aspect of this PR, which has been resolved,...

BTW, an alternative solution has just been posted here: https://github.com/huggingface/transformers/pull/32305 - so perhaps wait until the maintainers decide which version is the most fitting before you invest more time into improving...

That's a wonderfully written follow-up, Elsa. @samadejacobs, could you please have a look at Elsa's comments above and see if anything still needs to be added to your PR -...