llm-rs-python
How much RAM is needed to convert a GPT-2 13B model to GGML using your manual convert function?
I'm trying to convert it on 16 GB of RAM, but the conversion process seems to last forever.
Well, you can calculate it: 13B parameters × 16 bits (f16) = 26 GB. Accelerate will probably try to page some of the layers once you exceed your 16 GB, and get stuck there. Theoretically it's possible to stream the layers in, but I think neither GGML nor this project has implemented that yet for GPT-2.
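The arithmetic above can be sketched as a quick back-of-the-envelope estimate (weights only; real peak usage during conversion will be higher because of framework overhead and temporary buffers):

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """RAM needed just to hold the raw weights, in GB (10^9 bytes)."""
    return n_params * (bits_per_param / 8) / 1e9

# 13B parameters stored as f16 (16 bits each):
print(weight_memory_gb(13e9, 16))  # → 26.0 GB, well above 16 GB of RAM
```

So with 16 GB of physical RAM, the converter has to fall back on swap/paging, which is why the process appears to hang rather than fail outright.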