llama : model-based max number of graph nodes calculation
This fixes #8950 and #8615. This PR builds on top of the changes made in #8622.
The max number of graph nodes is calculated as `max(8192, model.tensors_by_name.size()*5)`, as recommended by @slaren in https://github.com/ggerganov/llama.cpp/issues/8950#issuecomment-2278807100. I specified 8192 as the minimum to ensure this change does not break any currently working models.
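A minimal sketch of that calculation is shown below; the helper name `llama_model_max_nodes` and its exact signature are assumptions for illustration and may not match the code in this PR.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper illustrating the calculation described above.
// The node budget scales with the number of tensors in the model,
// with a floor of 8192 so currently working models are unaffected.
static size_t llama_model_max_nodes(const llama_model & model) {
    return std::max<size_t>(8192, model.tensors_by_name.size() * 5);
}
```

The idea is that larger models (more tensors) build larger compute graphs, so a fixed node limit eventually overflows; scaling the limit with the tensor count keeps very large merges loadable without changing behavior for smaller models.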
Thanks to this change I was able to run inference on BigLlama-3.1-681B-Instruct and Meta-Llama-3-405B-Instruct-Up-Merge, both of which previously failed to load.
- [x] I have read the contributing guidelines
- Self-reported review complexity:
- [x] Low
- [ ] Medium
- [ ] High