Herman Semenoff comments

Results: 291 comments of Herman Semenoff

Reduce enum sizes some are used in structs, which allowed them to be optimized.

> /home/usbhost/llama.cpp/ggml/src/ggml.c:1568: GGML_ASSERT(type > 0 && type < GGML_TYPE_COUNT) failed

I also hit this assert as an error, so I fixed it. Apparently I need to remove the check >...

Reduce enum sizes some are used in structs, which allowed them to be optimized.

@USBhost, the newly force-pushed commits no longer require changing the standard from C99 to C23, which keeps support for older systems and compilers. I tested on Debian 12 with the flags GGML_CPU, GGML_BLAS,...
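As a rough illustration of the general idea (not the PR's actual code), the sketch below shows one C99-compatible way to shrink an enum-typed struct field: store the value in a fixed-width integer instead of relying on C23's fixed underlying type syntax. The names `demo_type`, `demo_wide`, and `demo_narrow` are hypothetical and do not come from ggml.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum for illustration only; not taken from ggml. */
enum demo_type {
    DEMO_TYPE_A = 0,
    DEMO_TYPE_B,
    DEMO_TYPE_COUNT
};

/* Field declared with the enum type: on common ABIs this is int-sized,
 * so the struct is padded out to the enum's alignment. */
struct demo_wide {
    enum demo_type type;
    uint8_t        flag;
};

/* C99-compatible alternative: keep the enum for naming the values,
 * but store it in a one-byte integer field. */
struct demo_narrow {
    uint8_t type; /* holds an enum demo_type value */
    uint8_t flag;
};

/* Helpers that report the resulting struct sizes. */
static size_t demo_wide_size(void)   { return sizeof(struct demo_wide); }
static size_t demo_narrow_size(void) { return sizeof(struct demo_narrow); }
```

The trade-off is that the narrow field loses the enum's type in the declaration, so assignments need a cast or a range check such as `value < DEMO_TYPE_COUNT` before storing.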

Reduce enum sizes some are used in structs, which allowed them to be optimized.

@USBhost, nice, 44/43 CI checks are successful; this PR is ready for code review and testing. Last unit test error (llama2 conversation): ``` 0.00.001.634 E llama_model_load: error loading model: error loading model...

Reduce enum sizes some are used in structs, which allowed them to be optimized.

@USBhost, # **clang version 19.1.7 (3) for x86_64-pc-linux-gnu** ## Master Host: Debian Sid Testing ``` $ ./llama-cli -m /home/debian/.codegpt/models/gguf/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf -p "I believe the meaning of life is" -n 72 build:...

Reduce enum sizes some are used in structs, which allowed them to be optimized.

Using 72 threads is too much overhead on Linux; `$(nproc --all)` crashes the system. ## Master ``` $ ./llama-bench -m /home/debian/.codegpt/models/gguf/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf -t 48 -r 10 -pg 512,128 | model | size | params | backend...

Reduce enum sizes some are used in structs, which allowed them to be optimized.

# **GCC (Debian 14.2.0-19) 14.2.0 for x86_64-linux-gnu** ## Master ``` $ ./llama-cli -m /home/debian/.codegpt/models/gguf/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf -p "I believe the meaning of life is" -n 72 build: 5170 (658987cf) with cc (Debian...

Reduce enum sizes some are used in structs, which allowed them to be optimized.

I still need to test with the MSVC compiler on Windows; I won't be able to fully test llama.cpp on a virtual machine.

Reduce enum sizes some are used in structs, which allowed them to be optimized.

> Well it does not seem to make inference speed any faster.

@USBhost, is that under Windows with MSVC? In other words, did the PR changes affect only the Clang build?

Reduce enum sizes some are used in structs, which allowed them to be optimized.

> Proxmox GCC 12. Same computer as on your older PR.

I didn't mean the master branch. It may also be necessary to check the old Clang and GCC, there is...

Reduce enum sizes some are used in structs, which allowed them to be optimized.

# **gcc-13 (Debian 13.3.0-13) 13.3.0 for x86_64-linux-gnu** ## Master ``` $ ./llama-cli -m /home/debian/.codegpt/models/gguf/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf -p "I believe the meaning of life is" -n 1024 -no-cnv build: 5170 (658987cf) with gcc-13...