ncnn [WIP] gemm block quantization for llm decoder style

[ ] move int468 dequant code to x86/arm/... create_pipeline
[ ] how to encode block_size and storage type ?
[ ] try fp4 e2m1/e3 type ?
[ ] comp table ?
[ ] expand to more general gemm ?
[ ] port union hack to platform-independent style
[ ] gemm test++
[ ] doc++
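
For readers new to the idea, here is a minimal sketch of block quantization as it is usually meant: weights are split into fixed-size blocks, and each block stores low-bit integers plus one float scale. It is not the code in this PR. `BlockQuantized`, `quantize_blocks` and `dequantize_blocks` are hypothetical names, the symmetric absmax scaling is an assumption, and values are left unpacked in `int8_t` for readability.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical container (not an ncnn type): quantized values plus one scale per block.
struct BlockQuantized
{
    int block_size;
    int bits;
    std::vector<int8_t> q;     // one signed value per weight, unpacked for clarity
    std::vector<float> scales; // one float scale per block
};

// Symmetric per-block quantization to 4, 6 or 8 bit signed integers.
static BlockQuantized quantize_blocks(const std::vector<float>& w, int block_size, int bits)
{
    assert(bits == 4 || bits == 6 || bits == 8);
    assert(block_size > 0 && w.size() % (size_t)block_size == 0);

    const int qmax = (1 << (bits - 1)) - 1; // 7, 31 or 127

    BlockQuantized out;
    out.block_size = block_size;
    out.bits = bits;
    out.q.resize(w.size());
    out.scales.resize(w.size() / block_size);

    for (size_t b = 0; b < out.scales.size(); b++)
    {
        const float* p = w.data() + b * block_size;

        // the largest magnitude in the block maps to qmax
        float absmax = 0.f;
        for (int i = 0; i < block_size; i++)
            absmax = std::fmax(absmax, std::fabs(p[i]));

        const float scale = absmax > 0.f ? absmax / qmax : 1.f;
        out.scales[b] = scale;

        for (int i = 0; i < block_size; i++)
        {
            int v = (int)std::lround(p[i] / scale);
            if (v > qmax) v = qmax;
            if (v < -qmax) v = -qmax;
            out.q[b * block_size + i] = (int8_t)v;
        }
    }
    return out;
}

// Reverse step: multiply each value by the scale of the block it belongs to.
static std::vector<float> dequantize_blocks(const BlockQuantized& bq)
{
    std::vector<float> w(bq.q.size());
    for (size_t i = 0; i < w.size(); i++)
        w[i] = bq.q[i] * bq.scales[i / bq.block_size];
    return w;
}
```

With this scheme the reconstruction error of each weight is bounded by about half of its block's scale, which is why a smaller `block_size` trades storage for accuracy.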

./ncnnllm2int468 qwen3_decoder.ncnn.param qwen3_decoder.ncnn.bin qwen3_decoder-int6.ncnn.param qwen3_decoder-int6.ncnn.bin

Dec 04 '25 11:12 nihui

Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Dec 04 '25 11:12 tencent-adm

Codecov Report

:x: Patch coverage is 7.52688% with 86 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 95.88%. Comparing base (37336e7) to head (58cc1f3).
:warning: Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/layer/gemm.cpp | 7.52% | 86 Missing :warning: |

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6439      +/-   ##
==========================================
+ Coverage   95.62%   95.88%   +0.26%     
==========================================
  Files         844      844              
  Lines      266761   266834      +73     
==========================================
+ Hits       255080   255859     +779     
+ Misses      11681    10975     -706

Dec 04 '25 11:12 codecov-commenter

The binary size change of libncnn.so (bytes)

| architecture | base size | pr size | difference |
|---|---|---|---|
| x86_64 | 15316400 | 15324592 | +8192 :warning: |
| armhf | 6229892 | 6234020 | +4128 :warning: |
| aarch64 | 9527616 | 9527536 | -80 :kissing_heart: |

Dec 04 '25 11:12 github-actions[bot]