Lukas Kreussel

Results: 114 comments by Lukas Kreussel

> @LLukas22 what masks are you talking about? @OlivierDehaene, I was referring to the attention masks for batched input, as reconstructing these on the client side could be challenging...

> No, you just need to concatenate the arrays, run the linear layer then index into the result with the length of the different arrays: > > ```python > embed_a...
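
The quoted suggestion (concatenate the arrays, run the linear layer once, then split the result by the original lengths) can be sketched roughly as below. This is only an illustration in plain Python, not the code from the truncated snippet: `linear`, `embed_batched`, and the `weight`/`bias` layout (`weight` as `in_dim x out_dim` nested lists) are hypothetical names and assumptions made for the example.

```python
def linear(rows, weight, bias):
    """Apply y = x @ W + b to every row (weight is in_dim x out_dim)."""
    columns = list(zip(*weight))  # one tuple of length in_dim per output feature
    return [
        [sum(x * w for x, w in zip(row, col)) + b for col, b in zip(columns, bias)]
        for row in rows
    ]


def embed_batched(arrays, weight, bias):
    """Run the linear layer once over all arrays, then split by length."""
    lengths = [len(a) for a in arrays]

    # 1. Concatenate every array into one long list of rows.
    stacked = [row for a in arrays for row in a]

    # 2. A single pass through the linear layer.
    out = linear(stacked, weight, bias)

    # 3. Slice the output back into per-array chunks using the lengths.
    results, start = [], 0
    for n in lengths:
        results.append(out[start:start + n])
        start += n
    return results
```

Because the linear layer acts on each row independently, this should give the same per-array output as calling it separately on each array, without needing any padding or attention mask for this step.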

> OK, the max batch size is like 20 for A100. Interesting, I have no problems running `BAAI/bge-m3` with a batch size of >80 on a single H100. But I'm...

I think what you are referring to are [debugger visualizers](https://rust-lang.github.io/rfcs/3191-debugger-visualizer.html), which let you customize how the `WinDbg` or `GDB` / `LLDB` debuggers display certain structs. This would allow us to...

> For the original question, we currently implement the `Display` trait to print a tensor in the same way pytorch would. We also have the `Debug` trait that provides a...

> Sorry I'm a bit lazy here and haven't google but would you know if codelldb/... provide some hooks that let you customize the printing of arbitrary data structures. I...

Already on it, got it converted and quantized, but it produced gibberish. I'm waiting on https://github.com/ggerganov/llama.cpp/issues/1602 to see how they will handle the Q, K, V weights. I don't want...

That's great! Maybe I will create a draft, but I would like to wait until it gets merged into ggml.

Yeah, I noticed that. It would be great if someone could try porting it to Rust. I'm currently quite busy implementing GPU acceleration for all architectures.😬

We should wait until GGUF is implemented and we have all the necessary fields in the model file.