apaz
I'm getting ready to take another swing at it. My idea of what to do so far:

1. Create functions in `utils.h` called `llama_load_buffer()`, `llama_save_buffer()`, and `llama_destroy_buffer()`. These will `mmap()`...
@jart It would double the disk usage, yes. But so does converting the model, and so does quantizing it. I think people are prepared for this. You're right though in...
@jart I have no idea how to support that in a portable way. I haven't dug too deep into it. I'm halfway through implementing part 1. The troubling thing is...
@jart I'm more lamenting the absurdity that there's no portable (C++11) way to find the size of a file. It truly baffles me. On POSIX there's `fstat`. On Windows...
The `mmap()`/`mlock()` changes in llama.cpp should be applicable here.
It would be best to take this up on https://github.com/ggerganov/ggml.
@oKatanaaa Switching between `std::ifstream` and `FILE*` should make no measurable difference. They are both tunable, conceptually do the exact same thing, and support (almost) exactly the same set of operations...
@oKatanaaa The branch is already in the repo. Just `git pull origin` and `git checkout mmap`.
Any updates or feedback on this @ggerganov?
Do we have any broader ideas for how this fits into the strategy for handling dynamic and data-dependent shapes? I was under the impression that this was just something...