BBC-Esq

Results: 104 comments of BBC-Esq

Has anyone been able to verify that bitsandbytes, BetterTransformer, or Flash Attention 2, for example, will work with the new LLaVA 1.6 models?
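For what it's worth, here's a minimal sketch of how one might test that combination with the transformers library. It assumes the llava-hf conversion of the 1.6 weights (`llava-hf/llava-v1.6-mistral-7b-hf`) rather than liuhaotian's original checkpoints, a recent transformers release with LLaVA-NeXT support, and a working flash-attn install; I haven't confirmed all three work together:

```python
import torch
from transformers import BitsAndBytesConfig, LlavaNextForConditionalGeneration, LlavaNextProcessor

# Assumed model id: the llava-hf conversion of LLaVA 1.6, not the original repo
model_id = "llava-hf/llava-v1.6-mistral-7b-hf"

# bitsandbytes 4-bit quantization combined with Flash Attention 2 in one load call
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

processor = LlavaNextProcessor.from_pretrained(model_id)
model = LlavaNextForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=quant_config,
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    torch_dtype=torch.float16,
    device_map="auto",
)
```

If the load succeeds without falling back to eager attention, that would at least confirm the three libraries coexist; generation quality would still need a separate check.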

Hello all. Just thought I'd post a question about Flash Attention 2 here: [https://github.com/Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention) Apparently it's making big waves and seems very powerful. Does anyone plan on seeing if...
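In case it helps anyone poke at it directly, here's a bare-bones sketch of the core `flash_attn_func` call from that repo. It assumes an NVIDIA GPU with fp16/bf16 support and the flash-attn package installed:

```python
import torch
from flash_attn import flash_attn_func  # pip install flash-attn

batch, seqlen, nheads, headdim = 2, 1024, 16, 64

# FlashAttention-2 expects (batch, seqlen, nheads, headdim) tensors
# in fp16 or bf16, resident on a CUDA device
q = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, dropout_p=0.0, causal=True)  # same shape as q
```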

> Hi @minhthuc2502, do you have a benchmark comparing Faster Whisper with and without Flash Attention?

I haven't benchmarked Whisper in relation to Flash Attention, but my hypothesis is that...
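For anyone who wants to test the hypothesis, a rough timing sketch along these lines might work. Note that the `flash_attention` flag is an assumption on my part; it would need a faster-whisper/CTranslate2 build that actually exposes it (CTranslate2 added flash attention support in its 4.x line), and `audio.wav` is a placeholder:

```python
import time
from faster_whisper import WhisperModel

def bench(flash: bool, audio: str = "audio.wav") -> float:
    # flash_attention is an assumed kwarg; requires a build with flash attention support
    model = WhisperModel("large-v2", device="cuda", compute_type="float16",
                         flash_attention=flash)
    start = time.perf_counter()
    segments, _ = model.transcribe(audio, beam_size=5)
    list(segments)  # transcribe() is lazy; drain the generator so decoding actually runs
    return time.perf_counter() - start

print(f"without flash attention: {bench(False):.2f}s")
print(f"with flash attention:    {bench(True):.2f}s")
```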

@AvivSham If you're asking for my opinion on how to speed things up generally, faster-whisper has a pull request for batch processing that hasn't been approved yet. If you don't want...
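For reference, batching along these lines is what later faster-whisper releases ended up shipping as `BatchedInferencePipeline`; treat this as a sketch, since the PR wasn't merged at the time and the final API may differ:

```python
from faster_whisper import BatchedInferencePipeline, WhisperModel

model = WhisperModel("large-v2", device="cuda", compute_type="float16")
batched = BatchedInferencePipeline(model=model)

# batch_size controls how many audio chunks are decoded in parallel;
# "audio.wav" is a placeholder path
segments, info = batched.transcribe("audio.wav", batch_size=16)
for seg in segments:
    print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text}")
```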

BTW, when I said "I can't really help" it's not that I don't want to...it's just that I'm tapped out as far as my personal knowledge goes...Programming is only a hobby for...

@AvivSham You might also test your script using beam sizes 1-5 and see if there's a difference. If there's a noticeable difference between using flash attention and not, you could...
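Something like this quick sweep is what I had in mind (the `audio.wav` path is a placeholder):

```python
import time
from faster_whisper import WhisperModel

model = WhisperModel("large-v2", device="cuda", compute_type="float16")

for beam in range(1, 6):  # beam sizes 1 through 5
    start = time.perf_counter()
    segments, _ = model.transcribe("audio.wav", beam_size=beam)
    list(segments)  # drain the lazy generator so decoding actually runs
    print(f"beam_size={beam}: {time.perf_counter() - start:.2f}s")
```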

This is awesome, dude. Wish I had programming experience to help with this, but alas I don't. I've been looking for ways to enable GPU acceleration for AMD GPUs using...

Have you gotten it to work at all yet?

@arlo-phoenix Can you enable the "Issues" tab on your GitHub repo so we can communicate that way? I'm possibly interested in incorporating this into my projects.

Here is an updated link: https://huggingface.co/liuhaotian/llava-v1.5-7b/tree/main