
About the calculation of overhead.

Open znsoftm opened this issue 2 years ago • 4 comments

https://github.com/ggerganov/ggml/issues/356

znsoftm avatar Jul 10 '23 02:07 znsoftm

For the BERT model, the overhead is calculated as:

model_mem_req += (5 + 16 * n_layer) * 256; // object overhead

Can anyone explain what the numbers mean? As I understand it, 5 is the number of extra tensors and 16 is the number of tensors in each layer, but what is 256 for?

Is it the size of the ggml_tensor struct? The actual size is 208 bytes, so is 256 the rounded-up size?

znsoftm avatar Jul 10 '23 02:07 znsoftm

My memory is a little hazy on this subject. Like you said, 5 should be the extra model-wide tensors that are not tied to any layer. I think I tried a smaller number than 256 for the size, but it crashed with OOM. Maybe the real size of C structs is always rounded up to the next power of 2?

skeskinen avatar Jul 10 '23 08:07 skeskinen

Thanks for your answer :)

znsoftm avatar Jul 10 '23 23:07 znsoftm

I have tested with the latest ggml, and the 256 has to be changed to 512. I do not understand why :(

znsoftm avatar Jul 11 '23 22:07 znsoftm
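
The arithmetic discussed in this thread can be sketched as follows. This is only an illustration, not code from bert.cpp or ggml: the helper names `object_overhead` and `next_pow2` are hypothetical, and `n_layer = 12` in the usage note is an assumed example value. The 208-byte struct size and the 256 and 512 per-tensor budgets are the figures quoted above; the thread does not confirm why the budget has to be larger than the struct size.

```cpp
#include <cstddef>

// Hypothetical helper: total bytes reserved for per-tensor bookkeeping,
// following the formula quoted in the issue:
//   (5 + 16 * n_layer) * per_tensor_bytes
// 5  = model-wide tensors not tied to any layer (per the thread)
// 16 = tensors per layer (per the thread)
constexpr std::size_t object_overhead(std::size_t n_layer,
                                      std::size_t per_tensor_bytes) {
    return (5 + 16 * n_layer) * per_tensor_bytes;
}

// Hypothetical helper: smallest power of two that is >= n.
// Only illustrates the "rounded up to the next power of 2" guess
// from the thread; it is not how ggml is known to size its objects.
constexpr std::size_t next_pow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}
```

With the assumed `n_layer = 12`, `object_overhead(12, 256)` reserves 197 * 256 = 50432 bytes, and `object_overhead(12, 512)` reserves twice that, 100864 bytes. `next_pow2(208)` gives 256, which matches the rounding guess for the reported struct size but does not account for the later need for 512.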