Fix clippy shadowing
Should reduce memory usage and hopefully increase speed...
Code Metrics Report
===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                    9           21           21            0            0
 Python                 31         1217         1038           37          142
 TOML                   16          440          400            1           39
-------------------------------------------------------------------------------
 Jupyter Notebooks       1            0            0            0            0
 |- Markdown             1           60           30           22            8
 |- Python               1           96           87            1            8
 (Total)                            156          117           23           16
-------------------------------------------------------------------------------
 Markdown               16         1149            0          846          303
 |- BASH                 5          100           97            0            3
 |- Python               6          122          110            0           12
 |- Rust                 2           80           72            3            5
 (Total)                           1451          279          849          323
-------------------------------------------------------------------------------
 Rust                  115        34412        31161          585         2666
 |- Markdown            57          641           13          594           34
 (Total)                          35053        31174         1179         2700
===============================================================================
 Total                 191        37715        33014         1469         3232
===============================================================================
@chenwanqq I cannot measure any T/s speedup. I think that because we are using a GPU and the commands are async, the Rust drop code runs quickly enough that there is no difference.
I implemented this change only for the models/quantized_llama.rs code.
I think the problem might not be about speed, but about peak memory usage. 🧐 For instance, whether a model can run within a limited amount of memory, or how many tokens it can process for a given model.
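To make the peak-memory point concrete, here is a minimal sketch of the Rust drop semantics involved. It is not the code from models/quantized_llama.rs (the diff is not quoted in this thread); `Tracker`, `Buf`, and `step` are hypothetical stand-ins for a tensor and a layer. The relevant rule is that a shadowed binding stays alive until the end of its scope, whereas assigning to a `mut` binding drops the old value right away.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Counts how many stand-in buffers are alive, and the highest count seen.
#[derive(Default)]
struct Tracker {
    live: Cell<usize>,
    peak: Cell<usize>,
}

/// Hypothetical stand-in for a tensor; only its lifetime matters here.
struct Buf {
    tracker: Rc<Tracker>,
}

impl Buf {
    fn new(tracker: &Rc<Tracker>) -> Self {
        let live = tracker.live.get() + 1;
        tracker.live.set(live);
        tracker.peak.set(tracker.peak.get().max(live));
        Buf { tracker: Rc::clone(tracker) }
    }

    /// Stand-in for one layer: the output is allocated while the input is still alive.
    fn step(&self) -> Buf {
        Buf::new(&self.tracker)
    }
}

impl Drop for Buf {
    fn drop(&mut self) {
        self.tracker.live.set(self.tracker.live.get() - 1);
    }
}

/// Shadowing: each `let x` hides the previous value but does not drop it.
fn peak_with_shadowing() -> usize {
    let tracker = Rc::new(Tracker::default());
    let x = Buf::new(&tracker);
    let x = x.step();
    let x = x.step();
    drop(x); // only the last buffer is released here; the shadowed ones live on
    let peak = tracker.peak.get();
    peak
}

/// Reassignment: the old value is dropped as soon as the new one is assigned.
fn peak_with_reassignment() -> usize {
    let tracker = Rc::new(Tracker::default());
    let mut x = Buf::new(&tracker);
    x = x.step();
    x = x.step();
    drop(x);
    let peak = tracker.peak.get();
    peak
}

fn main() {
    println!("shadowing peak:    {}", peak_with_shadowing()); // prints 3
    println!("reassignment peak: {}", peak_with_reassignment()); // prints 2
}
```

With shadowing, all three buffers are alive at once; with reassignment, at most two are (the input and the output of the current step). That kind of difference would show up in peak memory rather than in T/s.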