crabml Any speed tests?

Open lucasjinreal opened this issue 1 year ago • 1 comment

Any speed tests?

Apr 15 '24 03:04 lucasjinreal

I have a small benchmark script to compare the performance of llama.cpp (built with LLAMA_NO_METAL) and crabml when running CPU inference: https://gist.github.com/flaneur2020/27a384e8a6eae8963491c0bbf6bb9033
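For readers who want to repeat a comparison like this, a minimal timing harness might look like the sketch below. It is not the contents of the gist above: the helper name `time_command` and the placeholder command are assumptions for illustration, and the real llama.cpp or crabml invocation would have to be substituted.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not timed silently.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder command (hypothetical); replace with the actual binary and arguments.
    placeholder = [sys.executable, "-c", "pass"]
    timings = time_command(placeholder, runs=3)
    print(f"mean {statistics.mean(timings):.3f}s over {len(timings)} runs")
```

Running each binary several times and comparing means helps separate a real difference from run-to-run noise.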

It seems that crabml can outperform llama.cpp when generating 100 tokens with gemma 2b and openllama 3b on my M1 laptop, on a token-by-token basis:

llama.cpp | crabml
3.106     | 3.132
3.155     | 3.069
3.175     | 3.102
3.191     | 3.070
3.248     | 3.041
3.215     | 3.121
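The rows above can be summarized with a few lines of arithmetic. The comment does not state the unit of these numbers, so the sketch below only compares them as given and does not assume seconds or tokens per second; the variable names are illustrative.

```python
from statistics import mean

# Values copied from the table above, in row order.
llama_cpp = [3.106, 3.155, 3.175, 3.191, 3.248, 3.215]
crabml = [3.132, 3.069, 3.102, 3.070, 3.041, 3.121]

llama_mean = mean(llama_cpp)
crabml_mean = mean(crabml)

# Gap between the two means, relative to the llama.cpp mean.
relative_gap = (llama_mean - crabml_mean) / llama_mean

# Number of rows in which the crabml value is the lower of the two.
rows_lower = sum(c < l for l, c in zip(llama_cpp, crabml))

print(f"llama.cpp mean: {llama_mean:.3f}")   # 3.182
print(f"crabml mean:    {crabml_mean:.3f}")  # 3.089
print(f"crabml is lower in {rows_lower} of {len(crabml)} rows")  # 5 of 6
print(f"relative gap:   {relative_gap:.1%}")  # 2.9%
```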

However, prompt processing speed is still being optimized in crabml; we'd like to consider the approach in https://justine.lol/matmul/ to accelerate batched prompt processing.
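As a generic illustration of why batching matters here (not the kernels from the linked post, and not crabml's code): processing the prompt tokens together turns many matrix-vector products into one matrix-matrix product, so each weight can be loaded once and reused for every token. The sketch below shows that idea with a simple tiled loop in pure Python; the function names and the `tile` parameter are assumptions for illustration, and a real implementation would rely on SIMD and careful register use.

```python
def matvec(weights, x):
    """Plain matrix-vector product: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def matmul_tiled(weights, tokens, tile=2):
    """Apply `weights` to every token vector, walking the weights tile by tile.

    weights: list of rows, each of length `cols`.
    tokens:  list of input vectors, each of length `cols`.
    Returns one output vector (length `rows`) per token.
    """
    rows, cols = len(weights), len(weights[0])
    out = [[0.0] * rows for _ in tokens]
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            for r in range(r0, min(r0 + tile, rows)):
                for c in range(c0, min(c0 + tile, cols)):
                    w = weights[r][c]  # read once, reused for every token
                    for t, token in enumerate(tokens):
                        out[t][r] += w * token[c]
    return out


weights = [[1, 2, 3], [4, 5, 6]]
tokens = [[1, 0, 1], [2, 1, 0]]

# The batched result matches token-by-token matrix-vector products.
assert matmul_tiled(weights, tokens) == [matvec(weights, t) for t in tokens]
```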

Also, GPU acceleration is still WIP; I'd like to put together a better performance report once the GPU part becomes more usable.

Apr 18 '24 04:04 flaneur2020
