Deepak Soma Reddy

Results 4 comments of Deepak Soma Reddy

On each of the workers, I ran "./dllama worker --port 9998 --nthreads 8"; on the root node, "./dllama inference --model models/deepseek_r1_distill_llama_8b_q40/dllama_model_deepseek_r1_distill_llama_8b_q40.m --tokenizer models/deepseek_r1_distill_llama_8b_q40/dllama_tokenizer_deepseek_r1_distill_llama_8b_q40.t --buffer-float-type q80 --nthreads 8 --max-seq-len 4096 --prompt...

@b4rtaz Please find the logs below.

2x NUC (12th Gen) with AVX2 support:
![Image](https://github.com/user-attachments/assets/6e56d237-a347-42ed-ae52-71fd9c6559ea)

4x NUC (12th Gen) with AVX2 support:
![Image](https://github.com/user-attachments/assets/abff3a5a-05d0-47ca-a91e-d45afa42ad86)

All 4 NUCs are connected via a switch.

Thanks @b4rtaz. I tried connecting two devices directly without a router, and the results are slightly better. It improved by 1 token/sec. I see only slightly better results from 5.98 token/sec (with...