llama.cpp
Add disk space requirements to README.md
Add disk space requirements from https://cocktailpeanut.github.io/dalai/#/?id=_7b, as suggested in #195.
The size for the 65B model is not quite right. It's more like:

- 8x15.2 GiB = 121.6 GiB original model
- 8x15.2 GiB = 121.6 GiB GGML
- 8x4.75 GiB = 38 GiB quantized
- 281.2 GiB total
The other sizes are also not quite right, because the listed full size already includes the quantized size.
7B:

- 12.5 GiB original model
- 12.5 GiB GGML
- 3.92 GiB quantized
- 28.92 GiB total

I don't have the data for 13B.

30B:

- 4x15.2 GiB = 60.8 GiB original model
- 4x15.2 GiB = 60.8 GiB GGML
- 4x4.75 GiB = 19 GiB quantized
- 140.6 GiB total
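As a sanity check on the arithmetic above, the totals can be recomputed from the per-shard sizes reported in the listings below (the shard counts and approximate per-shard sizes are taken from this thread; the original `.pth` shards and the f16 GGML conversion are roughly the same size):

```python
# Per-shard sizes in GiB, read off the directory listings in this thread.
MODELS = {
    #       shards, f16 GiB/shard, q4_0 GiB/shard
    "7B":  (1, 12.5, 3.92),
    "30B": (4, 15.2, 4.75),
    "65B": (8, 15.2, 4.75),
}

def total_gib(name: str) -> float:
    """Disk needed to keep original + f16 GGML + q4_0 for one model."""
    shards, f16, q4 = MODELS[name]
    # Original .pth shards + f16 GGML shards + quantized shards.
    return round(shards * (f16 * 2 + q4), 2)

for name in MODELS:
    print(name, total_gib(name), "GiB")
```

This yields 28.92 GiB for 7B, 140.6 GiB for 30B, and 281.2 GiB for 65B, matching the lists above.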
```
$ find . -type d -exec du -hs {} \; | sort -h
30G	./7B
57G	./13B
141G	./30B
282G	./65B
507G	.
```
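The same per-directory summary can be produced with a single `du` invocation instead of one per directory (a minor alternative, assuming GNU coreutils):

```shell
# --max-depth=1 prints each top-level model directory
# plus the grand total for "." in one pass.
du -h --max-depth=1 . | sort -h
```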
```
$ find . -type f -exec ls -sh {} \; | grep G | sort -rk 2,2
4.0G ./7B/ggml-model-q4_0.bin
13G ./7B/ggml-model-f16.bin
13G ./7B/consolidated.00.pth
4.8G ./65B/ggml-model-q4_0.bin.7
4.8G ./65B/ggml-model-q4_0.bin.6
4.8G ./65B/ggml-model-q4_0.bin.5
4.8G ./65B/ggml-model-q4_0.bin.4
4.8G ./65B/ggml-model-q4_0.bin.3
4.8G ./65B/ggml-model-q4_0.bin.2
4.8G ./65B/ggml-model-q4_0.bin.1
4.8G ./65B/ggml-model-q4_0.bin
16G ./65B/ggml-model-f16.bin.7
16G ./65B/ggml-model-f16.bin.6
16G ./65B/ggml-model-f16.bin.5
16G ./65B/ggml-model-f16.bin.4
16G ./65B/ggml-model-f16.bin.3
16G ./65B/ggml-model-f16.bin.2
16G ./65B/ggml-model-f16.bin.1
16G ./65B/ggml-model-f16.bin
16G ./65B/consolidated.07.pth
16G ./65B/consolidated.06.pth
16G ./65B/consolidated.05.pth
16G ./65B/consolidated.04.pth
16G ./65B/consolidated.03.pth
16G ./65B/consolidated.02.pth
16G ./65B/consolidated.01.pth
16G ./65B/consolidated.00.pth
4.8G ./30B/ggml-model-q4_0.bin.3
4.8G ./30B/ggml-model-q4_0.bin.2
4.8G ./30B/ggml-model-q4_0.bin.1
4.8G ./30B/ggml-model-q4_0.bin
16G ./30B/ggml-model-f16.bin.3
16G ./30B/ggml-model-f16.bin.2
16G ./30B/ggml-model-f16.bin.1
16G ./30B/ggml-model-f16.bin
16G ./30B/consolidated.03.pth
16G ./30B/consolidated.02.pth
16G ./30B/consolidated.01.pth
16G ./30B/consolidated.00.pth
3.8G ./13B/ggml-model-q4_0.bin.1
3.8G ./13B/ggml-model-q4_0.bin
13G ./13B/ggml-model-f16.bin.1
13G ./13B/ggml-model-f16.bin
13G ./13B/consolidated.01.pth
13G ./13B/consolidated.00.pth
```
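One small caveat on the listing above: `grep G` would also match any path that merely contains a capital G, regardless of size. Filtering by size inside `find` itself avoids that (a sketch assuming GNU find):

```shell
# Keep only files strictly larger than 1 GiB; -size +1G filters on the
# file size itself rather than on the "G" suffix in the ls output.
find . -type f -size +1G -exec ls -sh {} \; | sort -rk 2,2
```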
Requirements added in https://github.com/ggerganov/llama.cpp/pull/269
Closing this one since it had wrong sizes anyway.