AlpinDale
Oh sorry, didn't mean to do that. :P
Hi @HaiShaw, Triton doesn't seem to support mixed-precision dot products, so this kernel fails if `k` is uint8 while `q` is another precision. I've been trying to...
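(Not necessarily the fix this kernel needs, but the usual workaround for a mixed-precision `tl.dot` is to explicitly upcast the lower-precision operand first, roughly `tl.dot(q, k.to(q.dtype))` inside the kernel. The idea, sketched in plain NumPy so it runs anywhere:)

```python
import numpy as np

# q is float16; k arrives as uint8 (e.g. a quantized cache).
# A direct dot of mixed dtypes is what Triton's tl.dot rejects.
q = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float16)
k = np.array([[5, 6], [7, 8]], dtype=np.uint8)

# Workaround: upcast k to q's dtype before the dot.
# In a Triton kernel this would be k = k.to(q.dtype) before tl.dot.
out = q @ k.astype(q.dtype)

print(out.dtype)   # float16
print(out.tolist())
```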
@krrishdholakia hi sorry for the late reply. I'd assume the LiteLLM OpenAI endpoint doesn't support any samplers beyond what OpenAI itself provides. Is that true? If not, I suppose we...
Now that Triton has upstream wheels for musl, I believe this PR can be closed.
Support being added in #8751
It would be great if Megatron-LM could support PEFT methods, e.g. QLoRA. We're sorely lacking a PEFT trainer with Tensor Parallelism.
Hi! What's the status on this PR? I'd like to train a few speculator models, but I'm not sure how to get started, due to a lack of documentation...
Thanks for the reply, @JRosenkranz. I'd love to wait, but I have access to a large cluster of H100s for a limited time, so I wanted to make the most...
Looks like installing flash-attn with our torch version doesn't work:

```
ImportError: /home/anon/miniconda3/envs/aphrodite/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
```

I'll look into it. Thanks for reporting.
I'll get to investigating this soon; I've been busy with other projects, so I haven't had much time to work on Aphrodite lately. I have an inkling that this is...