Ma Mingfei


@slaren just updated the CMake compiler options: `-mamx-tile`, `-mamx-int8` and `-mamx-bf16`!
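
For context, those `-mamx-*` options also make GCC/Clang define matching feature-test macros, so the AMX code paths can be guarded at compile time. A minimal sketch (the guard name `GGML_AMX_EXAMPLE_ENABLED` is hypothetical; the PR's actual guards may differ):

```
// Sketch: compile-time guard based on the feature macros implied by the
// -mamx-tile / -mamx-int8 / -mamx-bf16 options (guard name is hypothetical).
#if defined(__AMX_TILE__) && defined(__AMX_INT8__) && defined(__AMX_BF16__)
#    define GGML_AMX_EXAMPLE_ENABLED 1
#else
#    define GGML_AMX_EXAMPLED_ENABLED 0
#endif
```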

@ggerganov could you please tell me why this CI failed? https://github.com/ggml-org/ci/tree/results/llama.cpp/28/cfc0ffdbfec9520a2c190d57025350229d340c/ggml-4-x86-cuda-v100

@slaren changing `is_host` to false for the AMX backend leads to a fault in `ggml_backend_sched_backend_id_from_cur` (log attached below). Do you have any insight into how to fix it? ``` llama_new_context_with_model:...

@slaren could you please review this one again? I just changed `ggml_backend_buft_is_host` to return false for the AMX backend.
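
For reference, a minimal sketch of what that change looks like, assuming the ggml-backend buffer-type callback interface of the time (not the PR's code verbatim):

```
// Sketch: the AMX buffer type reports is_host = false, because its buffers hold
// weights repacked into an AMX/VNNI tile layout that other backends cannot read
// as plain host memory.
static bool ggml_backend_amx_buffer_type_is_host(ggml_backend_buffer_type_t buft) {
    (void) buft;    // unused
    return false;
}
```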

@ggerganov replaced tabs with 4 spaces. Also, the branch has been rebased and squashed into one commit.

@nai-kon originally I wrote kernels for

```
//(type == GGML_TYPE_Q8_0) ||
//(type == GGML_TYPE_Q4_K) ||
//(type == GGML_TYPE_Q5_K) ||
//(type == GGML_TYPE_Q6_K) ||
//(type == GGML_TYPE_IQ4_XS);
```

but later on,...

@slaren the major problem I have with ggml-backend is that I couldn't figure out how to do padding with the AMX backend (when packing weights for AMX, e.g. VNNI...
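
To illustrate the padding problem, here is a rough sketch of the kind of rounding the packed AMX layout needs; the tile sizes and helper names are assumptions, not the PR's actual code:

```
#include <stddef.h>

#define AMX_TILE_M 16   // assumed rows per AMX tile
#define AMX_TILE_K 64   // assumed int8 elements per tile row

static size_t round_up(size_t x, size_t m) {
    return (x + m - 1) / m * m;
}

// Size in bytes of the packed buffer for a (rows x cols) int8 weight matrix:
// both dimensions are padded up to full tiles, and the padding must be zeroed.
static size_t amx_packed_size(size_t rows, size_t cols) {
    return round_up(rows, AMX_TILE_M) * round_up(cols, AMX_TILE_K);
}
```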

This PR also adds OpenMP support, since the original pthread sync is done via atomics, which have very high overhead on server CPUs (and the sync has to be...
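
Roughly, the difference is between every thread spinning on a shared atomic counter versus letting the OpenMP runtime handle the barrier. A simplified sketch (not the PR's code):

```
#include <omp.h>
#include <stdatomic.h>

// Old style (simplified): each thread bumps a shared counter and busy-waits,
// which causes heavy cache-line contention on many-core server CPUs.
static atomic_int n_ready;
static void sync_atomic(int n_threads) {
    atomic_fetch_add(&n_ready, 1);
    while (atomic_load(&n_ready) < n_threads) { /* spin */ }
}

// With OpenMP (simplified): the implicit barrier at the end of the parallel
// region replaces the hand-rolled spin-wait.
static void run_with_openmp(void (*kernel)(int ith, int nth), int n_threads) {
    #pragma omp parallel num_threads(n_threads)
    {
        kernel(omp_get_thread_num(), omp_get_num_threads());
    }
}
```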

> BTW why AMX will greatly improve next token latency?

I also wrote a VNNI kernel for GEMV cases.
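
A simplified sketch of the kind of VNNI GEMV inner loop mentioned above, using `_mm512_dpbusd_epi32`; this is an illustration, not the PR's kernel, and it assumes unsigned int8 activations, signed int8 weights, and `k` a multiple of 64:

```
#include <immintrin.h>
#include <stdint.h>

// One output element of the GEMV: dot product of a 1 x k activation row with
// one k-element weight column, using the AVX-512 VNNI dot-product instruction.
static int32_t vnni_dot_i8(const uint8_t * x, const int8_t * w, int k) {
    __m512i acc = _mm512_setzero_si512();
    for (int i = 0; i < k; i += 64) {
        __m512i vx = _mm512_loadu_si512((const void *)(x + i));
        __m512i vw = _mm512_loadu_si512((const void *)(w + i));
        // for each group of 4 bytes: acc += (u8)x * (s8)w, widened to int32
        acc = _mm512_dpbusd_epi32(acc, vx, vw);
    }
    return _mm512_reduce_add_epi32(acc);
}
```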