apcameron

Results 18 issues of apcameron

Please add OpenCL support so that it can be used on GPUs that support OpenCL but not CUDA.

Please add support for RISC-V based systems.

Is it possible to provide an API that mimics the functionality of the OpenAI API?

### Prerequisites

- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md).
- [X] I searched using keywords...

enhancement

Please add support for the latest Meta models: https://ai.meta.com/blog/meta-llama-3-1/

When I run generation_inference.py I get the error below: `RuntimeError: FlashAttention only supports Ampere GPUs or newer.` Please add an option to disable it.

Please add support for Pascal-based GPUs. This used to work in older versions of flash-attention.

Please consider enabling the use of PyTorch's scaled_dot_product_attention as an alternative for those with older GPUs. See this example from another project: https://github.com/HiDream-ai/HiDream-I1/pull/27
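
A rough sketch of what such a fallback could look like, assuming PyTorch 2.0 or newer and tensors laid out as (batch, seqlen, heads, head_dim). The names `pick_attention_backend` and `attention` are hypothetical and do not come from any of the projects above. The 8.0 compute-capability threshold is an assumption based on the "Ampere GPUs or newer" error message.

```python
from typing import Tuple

# Assumption: the FlashAttention kernels need compute capability 8.0
# (Ampere) or newer, as the error message in the issue above suggests.
FLASH_ATTN_MIN_CAPABILITY = (8, 0)


def pick_attention_backend(capability: Tuple[int, int]) -> str:
    """Return "flash_attn" for Ampere or newer GPUs, otherwise "sdpa"."""
    if tuple(capability) >= FLASH_ATTN_MIN_CAPABILITY:
        return "flash_attn"
    return "sdpa"


def attention(q, k, v, causal: bool = False):
    """Hypothetical wrapper: q, k, v are (batch, seqlen, heads, head_dim)."""
    # Imported lazily so the backend check above works without PyTorch.
    import torch
    import torch.nn.functional as F

    use_flash = False
    if q.is_cuda:
        capability = torch.cuda.get_device_capability(q.device)
        use_flash = pick_attention_backend(capability) == "flash_attn"

    if use_flash:
        try:
            from flash_attn import flash_attn_func
            # flash_attn_func expects fp16/bf16 inputs in this same layout.
            return flash_attn_func(q, k, v, causal=causal)
        except ImportError:
            pass  # flash-attn not installed: fall through to SDPA

    # scaled_dot_product_attention expects (batch, heads, seqlen, head_dim),
    # so swap the seqlen and heads axes on the way in and out.
    out = F.scaled_dot_product_attention(
        q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2),
        is_causal=causal,
    )
    return out.transpose(1, 2)
```

With this shape of check, a Pascal card (compute capability 6.x) would take the `scaled_dot_product_attention` path instead of raising the RuntimeError.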