Vincent Bosch
> @vlbosch could you please try our latest nightly build version? I and some others noticed that the issue seems to be fixed and quite stable, at least on macOS,...
@danielhanchen I have an M3 Max MacBook Pro with 128GB unified RAM. Happy to help you port this great project to Apple Silicon 😃 If it helps, I can...
I am having the same issue on an M3 Max with 128GB RAM and Mistral Large-2407 in 4-bit. The model can be loaded with llama.cpp and HF Transformers using all...
I am using asitop to watch the activity. Up until the tokens are generated, 99-100% of the GPU is used at max MHz. Right after the first token is streamed, the GPU usage...
As per your suggestion in the other issue @awni I updated to (the latest) macOS Sequoia preview, but the issue persists. After a reboot and loading a large model like...
> Did you try setting the sysctl `sudo sysctl iogpu.disable_wired_collector=1`? That usually helps. Thanks! I can confirm this command works on macOS 15 DP 5, although it didn't work on...
The token generation speed is no different from macOS 14.6 after a fresh start. GPU utilization shows 100% continuously throughout generation. I am, however, under the impression that the...
@EricLBuehler Thanks for the quick reply! I can confirm that master builds correctly now. Maybe it's another issue, or maybe I don't understand how ISQ works, but when trying to run a...
Did you guys manage to successfully reproduce EAGLE 2 with Mistral? If so, I am curious as to the changes/settings that yield the best results. I'd like to train EAGLE...
I would also like the option to add another local embeddings model, such as BGE-M3. I tried adding it to the models folder myself, but couldn't get it to work...