Add disk space requirements to README.md

Open G3zz opened this issue 1 year ago • 2 comments

Add disk space requirements from https://cocktailpeanut.github.io/dalai/#/?id=_7b, as suggested in #195.

G3zz avatar Mar 16 '23 09:03 G3zz

The size for the 65B model is not quite right. It's more like:

8x15.2 GiB = 121.6 GiB Original Model
8x15.2 GiB = 121.6 GiB GGML
8x4.75 GiB =  38.0 GiB Quantized
             281.2 GiB Total

The other sizes are also not quite right, because the full size already includes the quantized size.

7B:

12.5  GiB Original Model
12.5  GiB GGML
 3.92 GiB Quantized
28.92 GiB Total

I don't have the data for 13B.

30B:

4x15.2 GiB = 60.8 GiB Original Model
4x15.2 GiB = 60.8 GiB GGML
4x4.75 GiB = 19.0 GiB Quantized
            140.6 GiB Total
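The per-model arithmetic above (original .pth shards + converted GGML f16 + q4_0 quantized copies, all on disk at once) can be sketched in a few lines. The shard counts and per-shard sizes below are the approximate figures from this thread, not authoritative numbers:

```python
# Approximate per-shard sizes (GiB) reported in this thread.
# (name: (num shards, f16 shard GiB, q4_0 shard GiB)); 13B omitted, no data.
MODELS = {
    "7B":  (1, 12.5, 3.92),
    "30B": (4, 15.2, 4.75),
    "65B": (8, 15.2, 4.75),
}

def total_gib(shards: int, f16_gib: float, q4_gib: float) -> float:
    """Disk needed when the original .pth, the GGML f16 conversion,
    and the q4_0 quantized model are all kept at once."""
    return shards * (2 * f16_gib + q4_gib)

for name, (n, f16, q4) in MODELS.items():
    print(f"{name}: {total_gib(n, f16, q4):.2f} GiB")
```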

cyloa avatar Mar 16 '23 09:03 cyloa

$ find . -type d -exec du -hs {} \; | sort -h
30G	./7B
57G	./13B
141G	./30B
282G	./65B
507G	.
$ find . -type f -exec ls -sh {} \; | grep G | sort -rk 2,2
4.0G ./7B/ggml-model-q4_0.bin
13G ./7B/ggml-model-f16.bin
13G ./7B/consolidated.00.pth
4.8G ./65B/ggml-model-q4_0.bin.7
4.8G ./65B/ggml-model-q4_0.bin.6
4.8G ./65B/ggml-model-q4_0.bin.5
4.8G ./65B/ggml-model-q4_0.bin.4
4.8G ./65B/ggml-model-q4_0.bin.3
4.8G ./65B/ggml-model-q4_0.bin.2
4.8G ./65B/ggml-model-q4_0.bin.1
4.8G ./65B/ggml-model-q4_0.bin
16G ./65B/ggml-model-f16.bin.7
16G ./65B/ggml-model-f16.bin.6
16G ./65B/ggml-model-f16.bin.5
16G ./65B/ggml-model-f16.bin.4
16G ./65B/ggml-model-f16.bin.3
16G ./65B/ggml-model-f16.bin.2
16G ./65B/ggml-model-f16.bin.1
16G ./65B/ggml-model-f16.bin
16G ./65B/consolidated.07.pth
16G ./65B/consolidated.06.pth
16G ./65B/consolidated.05.pth
16G ./65B/consolidated.04.pth
16G ./65B/consolidated.03.pth
16G ./65B/consolidated.02.pth
16G ./65B/consolidated.01.pth
16G ./65B/consolidated.00.pth
4.8G ./30B/ggml-model-q4_0.bin.3
4.8G ./30B/ggml-model-q4_0.bin.2
4.8G ./30B/ggml-model-q4_0.bin.1
4.8G ./30B/ggml-model-q4_0.bin
16G ./30B/ggml-model-f16.bin.3
16G ./30B/ggml-model-f16.bin.2
16G ./30B/ggml-model-f16.bin.1
16G ./30B/ggml-model-f16.bin
16G ./30B/consolidated.03.pth
16G ./30B/consolidated.02.pth
16G ./30B/consolidated.01.pth
16G ./30B/consolidated.00.pth
3.8G ./13B/ggml-model-q4_0.bin.1
3.8G ./13B/ggml-model-q4_0.bin
13G ./13B/ggml-model-f16.bin.1
13G ./13B/ggml-model-f16.bin
13G ./13B/consolidated.01.pth
13G ./13B/consolidated.00.pth

gjmulder avatar Mar 16 '23 12:03 gjmulder

Requirements added in https://github.com/ggerganov/llama.cpp/pull/269

Closing this one since it had wrong sizes anyway.

prusnak avatar Mar 18 '23 21:03 prusnak