22quinn issues


Results: 2 issues of 22quinn

Add log1p elementwise op

Summary: `log1p(x)` is more precise than `log(1+x)` when `x` is close to 0. We use the CUDA `log1pf` implementation for fp32. For other precision types, the input is first converted to float,...

CLA Signed

fb-exported
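
The precision claim in the `log1p` summary can be checked with a small sketch. This is not the PR's CUDA code: it uses Python's double-precision `math.log1p` as a stand-in for `log1pf`, and `naive_log1p` is a hypothetical helper name introduced only for the comparison.

```python
import math


def naive_log1p(x: float) -> float:
    """Compute log(1 + x) directly; the addition rounds away small x."""
    return math.log(1.0 + x)


# For tiny x, 1.0 + x rounds to exactly 1.0, so the naive form returns 0.
tiny = 1e-20
naive_tiny = naive_log1p(tiny)
precise_tiny = math.log1p(tiny)

# For moderately small x, the naive form keeps only a few correct digits.
x = 1e-10
# Reference from the Taylor series log(1 + x) = x - x^2/2 + x^3/3 - ...
reference = x - x * x / 2 + x ** 3 / 3
rel_err_naive = abs(naive_log1p(x) - reference) / reference
rel_err_precise = abs(math.log1p(x) - reference) / reference

print(f"naive(1e-20)   = {naive_tiny}")
print(f"log1p(1e-20)   = {precise_tiny}")
print(f"rel err naive  = {rel_err_naive:.3e}")
print(f"rel err log1p  = {rel_err_precise:.3e}")
```

The same effect is stronger in fp32, where the rounding step in `1 + x` discards far more of `x`, which is presumably why the op calls a dedicated `log1pf` routine.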

[V1] Support bad_words in sampler

Follows the implementation in [vllm/logits_process.py](https://github.com/vllm-project/vllm/blob/main/vllm/logits_process.py). There were two choices on where to tokenise bad words, and the second path was taken: 1. Pass both `bad_words` and `tokenizer` via requests, but...
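
As a rough illustration of what a bad-words filter in a sampler does, here is a minimal pure-Python sketch. It is not the code from this PR or from `vllm/logits_process.py`; the function name `apply_bad_words` and its parameters are hypothetical, and it assumes the bad words have already been tokenised into lists of token IDs. The idea: if the tokens generated so far end with all but the last token of a bad word, the logit of that last token is set to negative infinity so it cannot be sampled.

```python
import math
from typing import List, Sequence


def apply_bad_words(
    logits: List[float],
    past_token_ids: Sequence[int],
    bad_words_token_ids: Sequence[Sequence[int]],
) -> List[float]:
    """Return a copy of logits with completions of bad words masked out.

    All names here are illustrative assumptions, not vLLM's API.
    """
    out = list(logits)
    for bad in bad_words_token_ids:
        if not bad:
            continue  # skip empty token sequences
        prefix, last = list(bad[:-1]), bad[-1]
        if len(prefix) > len(past_token_ids):
            continue  # not enough history to match the prefix
        # A single-token bad word has an empty prefix and is always banned.
        if not prefix or list(past_token_ids[-len(prefix):]) == prefix:
            out[last] = -math.inf
    return out


# Vocabulary of 5 tokens; the history ends with token 2.
masked = apply_bad_words(
    logits=[0.0] * 5,
    past_token_ids=[1, 2],
    bad_words_token_ids=[[2, 3], [4], [0, 1]],
)
print(masked)
```

Here token 3 is masked because the history ends with 2, token 4 is masked unconditionally, and token 1 stays available because the history does not end with 0.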