not enough space in the context's memory pool (on Apple M1 Max, 32GB RAM, clip-vit-b-32)

Open dukeeagle opened this issue 1 year ago • 6 comments

Hi there,

Thank you so much for making this library. Unfortunately, I'm running into the following error:

./main --model '/Users/lucasigel/Downloads/laion_clip-vit-b-32-laion2b-s34b-b79k.ggmlv0.q4_0.bin'  --text "test" --image '/00000002.jpg' -v 1

clip_model_load: loading model from '/Users/lucasigel/Downloads/laion_clip-vit-b-32-laion2b-s34b-b79k.ggmlv0.q4_0.bin' - please wait....................................................clip_model_load: model size =    85.06 MB / num tensors = 397
clip_model_load: model loaded

ggml_new_tensor_impl: not enough space in the context's memory pool (needed 12051936, available 8388608)
Assertion failed: (false), function ggml_new_tensor_impl, file ggml.c, line 4449.

zsh: abort      ./main --model  --text "test" --image  -v 1

I'm running on a Mac Studio with M1 Max and 32 GB of RAM. I tried every available model binary on Hugging Face and still got the same memory pool error. Is this due to a memory allocation bug? I see in #17 that this was solved for some cases, and I'm wondering whether there are lingering issues here.

dukeeagle avatar Jul 09 '23 21:07 dukeeagle

Barely missing the threshold on openai_clip-vit-base-patch16.ggmlv0.f16.bin! Can we significantly reduce the minimum memory pool size? Is this just a bug that's massively inflating the minimum? I'd like to run clip.cpp on far less powerful devices than my Mac Studio, if that's possible.

ggml_new_tensor_impl: not enough space in the context's memory pool (needed 17471536, available 16777216)
Assertion failed: (false), function ggml_new_tensor_impl, file ggml.c, line 4449.
zsh: abort      ./main --model  --text "test" --image  -v 1

dukeeagle avatar Jul 09 '23 21:07 dukeeagle

It requests ~12 MB instead of the 8 MB that I set as a fixed value here: https://github.com/monatis/clip.cpp/blob/e2eee8e9b11afe4fc9fdb22d1f6d0ea53df9552a/clip.cpp#L24-L30

You can increase them slightly: 8 is for patch32 and 16 is for patch16, so adjust them to a value that works for you. Interestingly, it works for me with these values on Windows and Linux, but I haven't tried it on a MacBook yet. Additionally, quantized models may require slightly more memory. I'll try to replicate it tomorrow.
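For anyone else hitting this, the "needed" figure in the error message is enough to pick a new value. Here is a minimal sketch, assuming you just want to round that byte count up to whole MiB with optional headroom. The helper `pool_size_for` and the constant `kMiB` are made up for illustration and are not part of clip.cpp or ggml.

```cpp
#include <cstddef>

// Hypothetical helper, NOT part of clip.cpp or ggml.
// Rounds a byte count up to the next whole MiB and adds optional headroom.
constexpr std::size_t kMiB = 1024 * 1024;

constexpr std::size_t pool_size_for(std::size_t needed_bytes,
                                    std::size_t headroom_mib = 0) {
    return ((needed_bytes + kMiB - 1) / kMiB + headroom_mib) * kMiB;
}

// "needed" values copied from the two error messages in this thread.
static_assert(pool_size_for(12051936) == 12 * kMiB, "patch32 case needs 12 MiB");
static_assert(pool_size_for(17471536) == 17 * kMiB, "patch16 case needs 17 MiB");
```

With the figures reported here, that comes to 12 MiB for patch32 and 17 MiB for patch16. The actual constants live at the link above and may look different in other versions of the file.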

> I'd like to run clip.cpp on far less powerful devices

What kind of devices are you targeting? I'm quite interested in new use cases and low-end devices, so we can work on it anyway

monatis avatar Jul 09 '23 22:07 monatis

That worked! Thank you so much for the quick reply.

I want to run on Intel-era MacBook Pros and Airs, like a 13" MacBook Air 2019. Not very low-end in the grand scheme of things, haha.

dukeeagle avatar Jul 09 '23 22:07 dukeeagle

As an aside, have you tried converting these models to CoreML like they do in whisper.cpp?

dukeeagle avatar Jul 09 '23 22:07 dukeeagle

> That worked!

That's great! I'll try to find the root cause of this difference and patch it later on.

> Not very low-end in the grand scheme of things

Haha, yes. They should do a fairly good job.

> have you tried converting these models to CoreML

Not yet, but good point. I'd like to support additional deployment types as we find different use cases for clip.cpp.

monatis avatar Jul 09 '23 22:07 monatis

Really appreciate the quick replies here. Have you also considered building out a version of this for BLIP or other more recent CLIP variants? I'm currently exploring the steps involved. Large-scale image retrieval has worked far better on BLIP and BLIP2, but of course they take way more time and memory.

dukeeagle avatar Jul 09 '23 23:07 dukeeagle