Georgi Gerganov

Results: 1015 comments by Georgi Gerganov

Yes that's correct. In some places the necessary checks / asserts are missing.

I'm afraid it will be difficult for me to help here, because I don't have a multi-GPU system to test with and I am not very familiar with this code....

Hi Sara, thanks for your interest! I guess just a link to the repo would be good enough.

@FSSRepo would you like to review this PR? I think you are planning to add `ggml_pad` - see if the 2 things play along together

Great - very useful! This is exactly what we need to get us started. This will stay a bit in the background for some time as there are more pressing...

Thanks! Looks interesting - will give it a try tomorrow and share it around

Maybe try to make the MPT example auto-detect. I guess in the long term we should just add MPT support to `llama.cpp`. For example, here is ongoing work to add...

> As is evident, methods in ggml-alloc.c, e.g. `ggml_gallocr_reserve_n`, use `stderr` directly. So that begs the question of whether the internal logging in llama.cpp should not be made commonly available, e.g. in...

On AMD Ryzen 9 5950X and M2 Ultra, `SOFT_MAX` is about 1.5x faster than `master`. Using the following command to benchmark:

```bash
make -j tests && ./tests/test-backend-ops -o SOFT_MAX -b...
```

@LostRuins Which instruction set do you observe to fail (ARM, AVX, ..)?