HGB

36 comments by HGB

@ridgerchu Please add ROCm support for Triton. There's an official repo that supports Triton with ROCm: https://github.com/ROCm/triton Here is an example of the flash attention v2 implementation via Triton with...

Will gladly do!

Triton-nightly 3.0.0: could anyone help with this? ``` root@r4-0:~/matmulfreellm# python generate.py /opt/conda/envs/py_3.9/lib/python3.9/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible....

There are 3 versions of Triton that I have tried:
- Triton-nightly 3.0.0, the error I have is above
- Triton 2.1.0, which is the ROCm/triton repo/branch that you just mentioned @taylor-shift,...

I haven't used JAX, so I don't know. I ran the official rocm-triton Docker image and then ran my test cases from there. ![image](https://github.com/ridgerchu/matmulfreellm/assets/87762857/4bd1c80a-6d6f-4747-b846-1e6132321a93) https://github.com/ridgerchu/matmulfreellm/issues/17 Still, there are some verified...

I see, so we would still have to wait for the repo to be fully working with BitBLAS; until then we cannot see the results from the paper...

Wait, so you could still train a model and get faster training + VRAM reduction? It just doesn't work for inference? I might be wrong here, but how would we...

Does the Gemma 3 model have an issue similar to this one: https://github.com/ollama/ollama/issues/9871 ? I can't seem to use more than 8192 tokens. This seems to be the default for...

@gongchangsui Have you found a fix? I'm getting the same error

Hi @miladm @JackCaoG, appreciate it! Here's the code repo to clone/reproduce from:

```
git clone -b TPU https://github.com/radna0/EasyAnimate.git
```

You can follow this guide if you wish to use Docker....