bert.cpp
converter does not work with the current ggml
Tried to convert https://huggingface.co/intfloat/e5-large-v2
to ggml with the current d9f04e609fb7f7e5fb3b20a77d4d685219971009
commit. However, running the converted f32, f16, q4_0, and q4_1 models fails with a "not enough space in the context's memory pool"
message. Maybe it is related to https://github.com/ggerganov/ggml/issues/158 ?
I ran into the same issue, but after making these changes it works fine: https://github.com/skeskinen/bert.cpp/commit/007f0630705c6aa8bd53c04f15a5dd607bd0bddd
Thanks! I think the latest ggml can convert the models once your memory-size increases are applied. The code in this repository should be updated accordingly.
- model_mem_req += (5 + 16 * n_layer) * 256; // object overhead
+ model_mem_req += (5 + 16 * n_layer) * 512; // object overhead
- new_bert->buf_compute.resize(16 * 1024 * 1024);
+ new_bert->buf_compute.resize(32 * 1024 * 1024);
I see that this has been updated (https://github.com/skeskinen/bert.cpp/blob/master/bert.cpp#L461), but I am still seeing this error:
bert_load_from_file: n_vocab = 30522
bert_load_from_file: n_max_tokens = 512
bert_load_from_file: n_embd = 384
bert_load_from_file: n_intermediate = 1536
bert_load_from_file: n_head = 12
bert_load_from_file: n_layer = 12
bert_load_from_file: f16 = 0
bert_load_from_file: ggml ctx size = 126.80 MB
bert_load_from_file: ........................ done
bert_load_from_file: model size = 126.69 MB / num tensors = 197
bert_load_from_file: mem_per_token 898 KB, mem_per_input 538 MB
ggml_new_object: not enough space in the context's memory pool (needed 565700096, available 565174272)