llama.cpp
Streaming conversion with no torch
Drop torch, avoid loading the whole file into memory, process files in parallel, and use separate threads for reading/writing.
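As a rough illustration of the approach described above (not the PR's actual code), here is a minimal sketch of streaming conversion with a reader/writer thread pair per file and parallel processing across files. The file names, chunk size, and the `convert_chunk()` transform are hypothetical placeholders.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time instead of the whole file

def convert_chunk(data: bytes) -> bytes:
    # Placeholder for the actual tensor-conversion step.
    return data

def convert_file(src_path: str, dst_path: str) -> None:
    # Bounded queue keeps memory use constant regardless of file size.
    q: queue.Queue = queue.Queue(maxsize=8)

    def writer() -> None:
        with open(dst_path, "wb") as dst:
            while True:
                chunk = q.get()
                if chunk is None:  # sentinel: reader is done
                    break
                dst.write(convert_chunk(chunk))

    t = threading.Thread(target=writer)
    t.start()
    with open(src_path, "rb") as src:
        while chunk := src.read(CHUNK_SIZE):
            q.put(chunk)  # reader thread feeds the writer thread
    q.put(None)
    t.join()

# Process several model files in parallel, one reader/writer pair each.
files = [("model-00.bin", "out-00.bin"), ("model-01.bin", "out-01.bin")]
with ThreadPoolExecutor() as pool:
    for src, dst in files:
        pool.submit(convert_file, src, dst)
```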
The Python dependencies in .devops/full.Dockerfile should also be updated, which will conflict with my PR #293.
This looks like a very useful addition. Let's give it priority and merge after resolving the conflicts.
@ggerganov Any update on this? I really do not want to install PyTorch on my system (because of memory).
This is probably too outdated, so closing for now.