Jesse

Results: 99 comments by Jesse

@magikRUKKOLA Sorry I didn't get to it last night. I tried to apply these diffs this morning and my patch tool is saying: ``` patch: **** Only garbage was found...

Ah. There are mixed tab and space characters in those files, and therefore in the diffs, which makes them hard to apply - probably because GitHub or markdown is stripping...
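A minimal sketch of how one might check a file for mixed tab/space indentation before generating or applying a diff. The helper name `classify_indentation` is hypothetical and not part of any patch tool; it only inspects leading whitespace.

```python
def classify_indentation(text):
    """Group 1-based line numbers by the kind of leading whitespace they use.

    Hypothetical helper for illustration: "mixed" means a single line's
    indentation contains both tabs and spaces.
    """
    result = {"tabs": [], "spaces": [], "mixed": []}
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Leading run of tabs/spaces on this line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent:
            continue  # no indentation, nothing to classify
        if " " in indent and "\t" in indent:
            result["mixed"].append(lineno)
        elif "\t" in indent:
            result["tabs"].append(lineno)
        else:
            result["spaces"].append(lineno)
    return result


if __name__ == "__main__":
    sample = "def f():\n\t    return 1\n    x = 2\n\ty = 3\n"
    report = classify_indentation(sample)
    print(report)
    # A file is inconsistent if any line is mixed, or if some lines use
    # tabs while others use spaces.
    inconsistent = bool(report["mixed"]) or (
        bool(report["tabs"]) and bool(report["spaces"])
    )
    print("inconsistent indentation:", inconsistent)
```

If the report shows both tab-indented and space-indented lines, a diff pasted through a renderer that normalizes whitespace may no longer match the file.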

Tested with:
```bash
python ktransformers/server/main.py \
  --port 11434 \
  --model_path /data/DeepSeek-V3 \
  --model_name "DeepSeek-V3-0324:671b-q4_k_xl" \
  --gguf_path /data/DeepSeek-V3-0324-GGUF-UD/UD-Q4_K_XL \
  --optimize_config_path /home/jesse/ktransformers/ktransformers/ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml \
  --temperature 0.3 \
  --cpu_infer 30 \
  --cache_lens 131072 \
  ...
```

> I am pretty sure that the problem is similar to what is going on with balance_serve and R1/V3 can fix it. But should we do that? The thing is...

> Let's see if the authors will merge the PR you made. If not, I am forking it lol and doing everything properly. I wish I knew enough about LLM...

`llama.cpp` is faster for me in tok/s than `ktransformers`, due to the way it handles NUMA. I can run it with NPS4 and tune the number of CPU threads to...

Hey @KiruyaMomochi. I wrote the latest iteration of the DS 3.1 tool calling code. I wrote a ton of unit tests. Strongly recommend writing your own in the same style...

https://github.com/ggml-org/llama.cpp/pull/16932 sort of works, but in my testing with Open Hands it keeps stopping for some reason. I have to type "continue" constantly and it gets stuck in repetitive loops....