fairydreaming

Results 85 comments of fairydreaming

> I have an AMD EPYC 9654 with 96 cores and 192 threads. When running llama.cpp /main with Yi-34b-chat Q4, the peak inference speed tops out at around 60 threads. Setting...

I don't think the performance is THAT bad. My platform is Epyc 9374F on Asus K14PA-U12 motherboard with 12 x Samsung 32GB 2Rx8 4800MHz DDR5 RDIMM M321R4GA3BB6-CQK modules. My system...

> @fairydreaming have u tried running the mpirun on the same host by splitting into using 6-8 cpu cores each mpirun instance? on the same host. i think should be...

> @fairydreaming basically i thought running within the same host should be faster than going through 1gbps line for multiple hosts. > > 71.57% is more efficient than the 60%...

These changes cause failed assertions when running Cohere's Command R+ model: ``` main: sgemm.cpp:827: bool llamafile_sgemm(int, int, int, const void*, int, const void*, int, void*, int, int, int, int, int,...

I just tried a naive solution and replaced all ints in sgemm.cpp and sgemm.h with int64_t, and the resulting code works fine without any performance penalty (at least on my...
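As a rough illustration of why widening the index type can matter, here is a minimal sketch, not the actual `sgemm.cpp` code. It assumes (this is my assumption, the comment above does not state the cause) that the problem is a 32-bit offset computation that no longer fits for very large matrices. The helper names `element_offset` and `overflows_int32` are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: offset of element (row, col) in a row-major matrix
// with `ld` elements per row. All arithmetic is done in 64 bits, so the
// product row * ld cannot wrap for realistic matrix sizes.
inline int64_t element_offset(int64_t row, int64_t col, int64_t ld) {
    return row * ld + col;
}

// Hypothetical check: true when the same offset would not fit in a
// 32-bit int, i.e. when int-based index arithmetic would overflow.
inline bool overflows_int32(int64_t row, int64_t col, int64_t ld) {
    return element_offset(row, col, ld) > std::numeric_limits<int32_t>::max();
}
```

For example, `overflows_int32(70000, 0, 40000)` is true because 70000 * 40000 = 2,800,000,000 exceeds the 32-bit signed maximum of 2,147,483,647, while the `int64_t` version computes the offset without trouble.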

I checked [t5-base-spellchecker](https://huggingface.co/Bhuvana/t5-base-spellchecker) and it works with [#8141](https://github.com/ggerganov/llama.cpp/pull/8141): `./llama-cli -m /mnt/md0/models/t5-base-spellchecker.gguf -p 'christmas is celbrated on decembr 25 evry ear'` ``` ... llama_output_reserve: reallocating output buffer from size 0.13 MiB...

I have T5 working in llama.cpp, but the code needs to be cleaned up and it still uses an additional header file (darts.h - Double-ARray Trie System, MIT license) needed by...

> What functionality does `darts.h` provide? If it is just for performance string searches, we can replace it with some basic naive implementation for start @ggerganov It's a C++ header-only...

Things are going better than expected - I managed to get rid of the `darts.h` dependency and implement the necessary functionality. My naive trie implementation is 2x slower compared to `darts.h`...
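For readers unfamiliar with the idea, a naive trie for prefix lookups could look roughly like the sketch below. This is not the code from the PR; the type `NaiveTrie` and its methods `insert` and `longest_prefix` are hypothetical names chosen for illustration, and it uses a plain `std::map` per node rather than a double-array layout.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical naive trie: one heap-allocated node per character,
// children stored in a std::map keyed by the next byte.
struct NaiveTrie {
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        bool is_token = false;  // true if a stored key ends at this node
    };

    Node root;

    // Store `key` by walking or creating one node per character.
    void insert(const std::string & key) {
        Node * node = &root;
        for (char c : key) {
            auto & child = node->children[c];
            if (!child) {
                child = std::make_unique<Node>();
            }
            node = child.get();
        }
        node->is_token = true;
    }

    // Length of the longest stored key that is a prefix of `text`
    // starting at `pos`; returns 0 when no stored key matches.
    size_t longest_prefix(const std::string & text, size_t pos = 0) const {
        const Node * node = &root;
        size_t best = 0;
        for (size_t i = pos; i < text.size(); ++i) {
            auto it = node->children.find(text[i]);
            if (it == node->children.end()) {
                break;
            }
            node = it->second.get();
            if (node->is_token) {
                best = i - pos + 1;
            }
        }
        return best;
    }
};
```

After inserting `"de"`, `"dec"` and `"december"`, calling `longest_prefix("decembr")` returns 3, since `"dec"` is the longest stored key that prefixes the misspelled word. The per-node map lookups and pointer chasing are one plausible reason a structure like this would be slower than a double-array trie.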