llama.cpp

llama : model-based max number of graph nodes

Open ggerganov opened this issue 1 year ago • 1 comments

fix #8615

This proposes determining the max number of graph nodes from the model info (arch, hparams, etc.) instead of using a single fixed limit.

ggerganov avatar Jul 22 '24 06:07 ggerganov

Ok, I will merge this after the 405B model is released and the need for this change is confirmed. The proposed n_layer > 400 check will likely have to be updated, because this number seems too big to me.

ggerganov avatar Jul 22 '24 13:07 ggerganov

@ggerganov What's left to do on this so far?

ceddybi avatar Jul 26 '24 22:07 ceddybi

I haven't noticed any reports of 405B failing, so I removed the increased max nodes limit for now and plan to merge just the new llama_model_max_nodes function.

ggerganov avatar Jul 27 '24 10:07 ggerganov
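
For readers following the thread, here is a minimal sketch of the two variants discussed above: a model-based function that still returns one fixed limit, and the variant that raises the limit for very deep models. It is an illustration only, not the merged code. The `model_info` struct, the `arch_kind` enum, both function names, and the two limit values are hypothetical stand-ins rather than llama.cpp's real types or constants. The layer threshold is passed as a parameter because the thread leaves open what it should be.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the model metadata; NOT the real llama.cpp types.
enum class arch_kind { llama, other };

struct model_info {
    arch_kind     arch;     // model architecture
    std::uint32_t n_layer;  // number of layers, taken from the hparams
};

// Assumed limits, chosen only for illustration.
constexpr std::size_t DEFAULT_MAX_NODES = 8192;
constexpr std::size_t LARGE_MAX_NODES   = 32768;

// Variant described as planned for merge: the limit is obtained through a
// function that receives the model, but it returns the same value for every model.
constexpr std::size_t max_nodes_fixed(const model_info & /*model*/) {
    return DEFAULT_MAX_NODES;
}

// Variant with the layer-count check that was removed "for now": models of the
// given architecture with more than `layer_threshold` layers get the larger limit.
constexpr std::size_t max_nodes_by_layers(const model_info & model,
                                          std::uint32_t layer_threshold) {
    return (model.arch == arch_kind::llama && model.n_layer > layer_threshold)
        ? LARGE_MAX_NODES
        : DEFAULT_MAX_NODES;
}
```

Routing the limit through a function keeps call sites unchanged if a model-specific rule is added later. Only the function body would need to change.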