Support for bigcode/starcoder
Hi!
I saw the example for the bigcode/gpt_bigcode-santacoder
model. I am wondering how I can run the bigcode/starcoder
model on CPU with a similar approach.
When I ran the following command:
python examples/starcoder/convert-hf-to-ggml.py bigcode/starcoder
I encountered this error:
OSError: Consistency check failed: file should be of size 9904379239 but has size 2282030570 ((…)l-00001-of-00007.bin).
Any ideas or help would be greatly appreciated.
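For what it's worth, a size mismatch like this usually means a shard download was interrupted and a truncated file is sitting in the Hugging Face cache. A minimal sketch to force a clean re-download (assuming huggingface_hub is installed; the pattern list is illustrative):

# Force a clean re-download of the model shards; force_download discards
# any truncated files already in the cache. allow_patterns is illustrative.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bigcode/starcoder",
    allow_patterns=["*.bin", "*.json", "*.txt"],
    force_download=True,
)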
% git diff
diff --git a/examples/starcoder/main.cpp b/examples/starcoder/main.cpp
index c9d1d7e..1972732 100644
--- a/examples/starcoder/main.cpp
+++ b/examples/starcoder/main.cpp
@@ -18,11 +18,11 @@
// https://huggingface.co/bigcode/gpt_bigcode-santacoder/blob/main/config.json
struct starcoder_hparams {
int32_t n_vocab = 49280;
- int32_t n_ctx = 2048;
- int32_t n_embd = 2048;
- int32_t n_head = 16;
- int32_t n_layer = 24;
- int32_t ftype = 1;
+ int32_t n_ctx = 8192;
+ int32_t n_embd = 6144;
+ int32_t n_head = 48;
+ int32_t n_layer = 40;
+ int32_t ftype = 1;
};
struct starcoder_layer {
diff --git a/examples/starcoder/quantize.cpp b/examples/starcoder/quantize.cpp
index 101af50..09111bb 100644
--- a/examples/starcoder/quantize.cpp
+++ b/examples/starcoder/quantize.cpp
@@ -16,11 +16,11 @@
// default hparams (GPT-2 117M)
struct starcoder_hparams {
int32_t n_vocab = 49280;
- int32_t n_ctx = 2048;
- int32_t n_embd = 2048;
- int32_t n_head = 16;
- int32_t n_layer = 24;
- int32_t ftype = 1;
+ int32_t n_ctx = 8192;
+ int32_t n_embd = 6144;
+ int32_t n_head = 48;
+ int32_t n_layer = 40;
+ int32_t ftype = 1;
};
// quantize a model
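For reference, the values in this patch can be cross-checked against the model's config.json on the Hub. A quick sketch, assuming the GPT-BigCode config exposes the GPT-2-style attribute names used above:

# Print StarCoder's hyperparameters straight from its config.json.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("bigcode/starcoder")
for name in ("vocab_size", "n_positions", "n_embd", "n_head", "n_layer"):
    print(name, "=", getattr(config, name))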
Converted and quantized models can be found here:
https://huggingface.co/NeoDim/starcoder-GGML
https://huggingface.co/NeoDim/starcoderbase-GGML
https://huggingface.co/NeoDim/starchat-alpha-GGML
@s-kostyaev I don't think you need this patch; the correct parameters are loaded from the model file.
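One way to convince yourself of this is to read the hparams back out of the converted file. A minimal sketch, assuming the header layout written by the starcoder example's convert-hf-to-ggml.py (a "ggml" magic followed by six little-endian int32 values) and a hypothetical output path:

# Peek at the hparams stored in a converted GGML file. The header layout
# (magic + six int32 hparams) is an assumption based on the starcoder
# example's convert script; the file path below is hypothetical.
import struct

with open("models/starcoder-ggml.bin", "rb") as f:
    magic, n_vocab, n_ctx, n_embd, n_head, n_layer, ftype = struct.unpack(
        "<7i", f.read(7 * 4)
    )
assert magic == 0x67676D6C  # "ggml"
print(n_vocab, n_ctx, n_embd, n_head, n_layer, ftype)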
@ggerganov Ok. Why doesn't it work for @seyyedaliayati, then?
@ggerganov you are right. Without the patch everything works fine. Thank you for the information.
You said the correct parameters are loaded from the model file, so why do I get this error? If you need more information, please let me know. Thanks.
I just re-ran it and now I get this:
(base) ali@host:~/ggml$ python examples/starcoder/convert-hf-to-ggml.py bigcode/starcoder
Loading model: bigcode/starcoder
Downloading (…)l-00001-of-00007.bin: 100%|█████████████████████████████████████████| 9.90G/9.90G [05:09<00:00, 32.0MB/s]
Downloading (…)l-00002-of-00007.bin: 100%|█████████████████████████████████████████| 9.86G/9.86G [04:51<00:00, 33.8MB/s]
Downloading (…)l-00003-of-00007.bin: 100%|█████████████████████████████████████████| 9.85G/9.85G [04:59<00:00, 32.9MB/s]
Downloading (…)l-00004-of-00007.bin: 100%|█████████████████████████████████████████| 9.86G/9.86G [04:53<00:00, 33.6MB/s]
Downloading (…)l-00005-of-00007.bin: 100%|█████████████████████████████████████████| 9.85G/9.85G [04:55<00:00, 33.3MB/s]
Downloading (…)l-00006-of-00007.bin: 100%|█████████████████████████████████████████| 9.86G/9.86G [04:58<00:00, 33.1MB/s]
Downloading (…)l-00007-of-00007.bin: 100%|█████████████████████████████████████████| 4.08G/4.08G [01:56<00:00, 34.9MB/s]
Downloading shards: 100%|████████████████████████████████████████████████████████████████| 7/7 [31:46<00:00, 272.30s/it]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████| 7/7 [03:42<00:00, 31.77s/it]
Killed
(base) ali@host:~/ggml$ python examples/starcoder/convert-hf-to-ggml.py bigcode/starcoder
Loading model: bigcode/starcoder
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████| 7/7 [03:38<00:00, 31.20s/it]
Killed
Is it because I am running on WSL?
I'm afraid you may be lacking system memory for the conversion.
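If adding RAM is not an option, loading the checkpoint in fp16 with low_cpu_mem_usage=True may roughly halve the peak memory. A sketch using standard transformers options; whether the conversion script then works unmodified on fp16 weights is untested:

# Reduce peak RAM while loading the checkpoint for conversion.
# These are standard transformers options, not part of the ggml scripts.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder",
    torch_dtype=torch.float16,   # half the memory of fp32 weights
    low_cpu_mem_usage=True,      # avoid materializing a second full copy
)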
You are right. I increased my RAM and the issue is solved!