Eric Buehler issues


Results: 136 issues of Eric Buehler

Move supports_attn_softmax logic to build.rs

@sgrebnov this is the solution I mentioned in #935. What do you think?

KV Cache Quantization

- [x] Metal kernels
  - [x] Quantize (f32, f16, bf16) -> (q4_0, q8_0)
  - [x] Dequantize (q4_0, q8_0) -> (f32, f16, bf16)
- [ ] CUDA kernels
  - [ ] Quantize (f32,...
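To illustrate what the quantize/dequantize steps in the checklist above involve, here is a minimal CPU-side sketch in Rust. It assumes the GGML-style q8_0 convention (blocks of 32 values, one scale per block equal to max|x| / 127); the issue itself only names the formats, so the layout, the `BlockQ8` type, and the `quantize_q8` / `dequantize_q8` functions are hypothetical and are not the project's actual Metal or CUDA kernels.

```rust
/// Values per block; 32 is assumed here, following the GGML q8_0 convention.
const BLOCK: usize = 32;

/// One q8_0-style block: a per-block scale plus 32 signed 8-bit values.
/// (GGML stores the scale as f16; f32 is used here to avoid extra crates.)
#[derive(Debug, Clone)]
struct BlockQ8 {
    scale: f32,
    qs: [i8; BLOCK],
}

/// Quantize f32 values into q8_0-style blocks (hypothetical helper).
fn quantize_q8(xs: &[f32]) -> Vec<BlockQ8> {
    assert!(xs.len() % BLOCK == 0, "length must be a multiple of 32");
    xs.chunks_exact(BLOCK)
        .map(|chunk| {
            // The largest magnitude in the block determines the scale.
            let amax = chunk.iter().fold(0.0f32, |m, v| m.max(v.abs()));
            let scale = amax / 127.0;
            // An all-zero block gets scale 0 and all-zero quants.
            let inv = if scale > 0.0 { 1.0 / scale } else { 0.0 };
            let mut qs = [0i8; BLOCK];
            for (q, v) in qs.iter_mut().zip(chunk) {
                *q = (v * inv).round() as i8;
            }
            BlockQ8 { scale, qs }
        })
        .collect()
}

/// Dequantize blocks back to f32 (hypothetical helper).
fn dequantize_q8(blocks: &[BlockQ8]) -> Vec<f32> {
    blocks
        .iter()
        .flat_map(|b| b.qs.iter().map(move |&q| q as f32 * b.scale))
        .collect()
}
```

Under these assumptions, a round trip through `quantize_q8` and `dequantize_q8` reproduces each value to within half a quantization step (scale / 2) of the original, which is the accuracy trade-off a quantized KV cache accepts in exchange for lower memory use.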

new feature

Add BNB cuda gemv 4bit kernels

Bitsandbytes quantization ISQ support

Tracking: Metal performance vs. MLX, llama.cpp

This issue serves to track performance on Metal hardware versus MLX and llama.cpp.

optimization

It would be amazing to support [Apple's Metal](https://developer.apple.com/documentation/metal) Shading Language (`.m` and `.metal` files)! It's similar to CUDA and is popular, so it would be great to have.

[tmp] CUDA allocation log

Add mdbook docs

`mdbook serve mdbook_docs --open`


Add an easy cli