ivarflakstad comments

Results: 12 comments of ivarflakstad

Matrix multiplication support

Fair enough 😊 Maybe you know who does?

Falcon example seems broken (on metal)

I have [this](https://github.com/huggingface/candle/tree/metal-mfa-bfloat) branch with working bfloat matmul. I'm testing Falcon on it now (downloading). It is based on work I've done [here](https://github.com/ivarflakstad/metal-flash-attention/tree/temp-bfloat-work-stash) which is not ready to be...

Falcon example seems broken (on metal)

If you have enough RAM you should be able to run Falcon on the candle branch I mentioned above. Here I am running Mamba (130m) with bf16:

Metal Backend not properly loading large models at 16GB of RAM

I have an M1 Pro with 32 GB of RAM. Metal: 7.30 tokens/s vs. Accelerate: 2.28 tokens/s. Is it still slow for you?

Metal Backend not properly loading large models at 16GB of RAM

I'm on the main branch. That's why I'm asking if it is still slow for you :)

Metal Backend not properly loading large models at 16GB of RAM

There is experimental Metal support. It does not use MPS right now; I might add it at some point as a fallback for compatibility reasons. So yes there is...

Metal Backend not properly loading large models at 16GB of RAM

Memory could be the issue, but then I would expect your computer to be showing signs of that as you are running the model. Is it? For comparison, could you...

Metal Backend not properly loading large models at 16GB of RAM

@bayedieng I recently refurbished the buffer allocator for metal, which is now merged in main - would you mind checking if it has improved the issue? :)

Metal Backend not properly loading large models at 16GB of RAM

Ok thanks. Could you try using `cargo-instruments -t Allocations` and share what it looks like? :)

Add GemmType trait for dispatching gemm fn calls

Hmm I see, that's a valid point. Still I want to emphasize that it is the exact same "bounds" as were there originally - except it was expressed through if...
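
To illustrate the point about "bounds", here is a minimal sketch, not the code from the pull request. It shows the same set of supported element types expressed once as a runtime `if` check and once as a trait bound. The names `GemmType`, `naive_gemm`, `supported_at_runtime`, and `matmul` are hypothetical and chosen only for this example, and the naive triple loop stands in for whatever optimized gemm routine is actually dispatched to.

```rust
use std::any::TypeId;
use std::ops::{Add, Mul};

/// Naive row-major matrix multiply: c (m x n) = a (m x k) * b (k x n).
/// Stand-in for an optimized gemm kernel (assumption for illustration).
fn naive_gemm<T>(m: usize, n: usize, k: usize, a: &[T], b: &[T], c: &mut [T])
where
    T: Copy + Default + Add<Output = T> + Mul<Output = T>,
{
    for i in 0..m {
        for j in 0..n {
            let mut acc = T::default();
            for p in 0..k {
                acc = acc + a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}

/// Hypothetical trait: each supported element type says how its gemm call is made.
pub trait GemmType: Copy + Default + 'static {
    fn gemm(m: usize, n: usize, k: usize, a: &[Self], b: &[Self], c: &mut [Self]);
}

impl GemmType for f32 {
    fn gemm(m: usize, n: usize, k: usize, a: &[f32], b: &[f32], c: &mut [f32]) {
        naive_gemm(m, n, k, a, b, c)
    }
}

impl GemmType for f64 {
    fn gemm(m: usize, n: usize, k: usize, a: &[f64], b: &[f64], c: &mut [f64]) {
        naive_gemm(m, n, k, a, b, c)
    }
}

/// The same restriction written as a runtime check: only f32 and f64 pass.
pub fn supported_at_runtime<T: 'static>() -> bool {
    let id = TypeId::of::<T>();
    id == TypeId::of::<f32>() || id == TypeId::of::<f64>()
}

/// With the trait bound, an unsupported type is rejected at compile time instead.
pub fn matmul<T: GemmType>(m: usize, n: usize, k: usize, a: &[T], b: &[T]) -> Vec<T> {
    let mut c = vec![T::default(); m * n];
    T::gemm(m, n, k, a, b, &mut c);
    c
}
```

In this sketch, `matmul::<i32>` fails to compile because `i32` does not implement `GemmType`, whereas `supported_at_runtime::<i32>()` simply returns `false`. The set of accepted types is identical in both forms; only the point at which the restriction is enforced differs.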