Vincent Zhong
## Motivation Add the minimum version of gguf as needed. ## Modifications The version set is based on the link to `transformers`, which has a bit more context to the...
## Motivation Motivated by the note in the code, we should use `torch.clamp`. cc @merrymercy, if you could please take a look. https://pytorch.org/docs/stable/generated/torch.clamp.html ## Modifications Use `torch.clamp` for val > The `torch.minimum` function...
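A minimal sketch of the substitution described above, assuming the goal is to cap a tensor at a scalar upper bound. The helper name `cap_values` and the sample values are hypothetical and do not come from the PR. The quoted note about `torch.minimum` is truncated above, so this only illustrates the documented behavior of the two functions.

```python
import torch


def cap_values(x: torch.Tensor, upper: float) -> torch.Tensor:
    """Cap every element of x at `upper` (hypothetical helper for illustration)."""
    # torch.clamp accepts a plain Python number for `max`,
    # so no tensor has to be built for the bound.
    return torch.clamp(x, max=upper)


x = torch.tensor([0.5, 2.0, 7.5])
capped = cap_values(x, 3.0)

# Same result with torch.minimum, which takes a tensor for both arguments.
reference = torch.minimum(x, torch.tensor(3.0))
```

Both calls produce the same values here. The `torch.clamp` form avoids wrapping the scalar bound in a tensor.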
## Motivation So that Python 3.8 is supported. These versions are not tested this way on the regular CI, so a break can be missed, but we should be careful....
# Question In persistent QK RMS norm, what is a real example use case where it beats the regular path? https://github.com/flashinfer-ai/flashinfer/pull/1843 is the PR that introduced this feature (dispatch to either implementation). ##...