TinyChatEngine
Block size = 32 assertion fails
I have tried both LLaMA and VILA models.
Both fail with the following error when I run ./chat:
../kernels/avx/matmul_avx_int4.cc:701: void matmul::MatmulOperator::mat_mul_accelerator_int4_fast_no_offset(const matmul_params*): Assertion `params->block_size == 32' failed.
Aborted (core dumped)
When I print the block_size parameter passed to this function, its value is 128.
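To show the mismatch without aborting, a check along these lines could be placed before the kernel call. This is only a sketch under my own assumptions: `ParamsSketch` and `block_size_matches` are hypothetical names, not part of TinyChatEngine, and the real `matmul_params` has more fields than the one modeled here.

```cpp
#include <cstdio>

// Hypothetical stand-in for the real matmul_params; only the field
// relevant to this issue is modeled.
struct ParamsSketch {
    int block_size;
};

// Block size the int4 AVX kernel asserts on, per the error message above.
constexpr int kExpectedBlockSize = 32;

// Returns true when the block size matches what the kernel expects;
// otherwise prints both values to stderr and returns false.
bool block_size_matches(const ParamsSketch& params) {
    if (params.block_size == kExpectedBlockSize) {
        return true;
    }
    std::fprintf(stderr, "block_size mismatch: got %d, kernel expects %d\n",
                 params.block_size, kExpectedBlockSize);
    return false;
}
```

With the value I observe (128), `block_size_matches` would return false and print both numbers, which at least confirms where the mismatch is detected.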
Does anyone know why this happens? How can I set the block size to 32?