Unreachable
- Start LLM
- Close laptop
- Sleep 8 hours
- Open laptop
- Issue command to LLM
It's probably a browser issue (the equivalent of a segfault when running natively).
I think the browser may be clearing the blobs from the memory when the tab gets suspended (after it has gone unused for some time).
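For anyone who wants to test the suspension theory, here is a minimal sketch (not part of wllama) that records the lifecycle events a tab receives. `watchLifecycle` is a hypothetical helper name. `visibilitychange` is a standard event, while `freeze` and `resume` come from the Page Lifecycle API and, as far as I know, are only fired by Chromium-based browsers.

```typescript
// One recorded lifecycle event: its name and when it was observed.
type LifecycleEntry = { type: string; at: number };

// Hypothetical helper: attaches listeners to `target` (normally `document`)
// and returns an array that grows as the events fire.
function watchLifecycle(
  target: EventTarget,
  eventTypes: string[] = ["visibilitychange", "freeze", "resume"]
): LifecycleEntry[] {
  const entries: LifecycleEntry[] = [];
  for (const type of eventTypes) {
    target.addEventListener(type, () => {
      entries.push({ type, at: Date.now() });
    });
  }
  return entries;
}
```

In the page, one could call `watchLifecycle(document)` before loading the model and inspect the returned array once the error appears. A `freeze` entry shortly before the failure would be consistent with the suspension theory; an empty array would point elsewhere.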
I just noticed this one again, this time on an Android phone (Pixel 6a, Chrome) with just one browser tab open and everything else closed manually.
I was trying to load a Gemma 2 2B it model.
https://huggingface.co/BoscoTheDog/gemma_2_2b_it_Q4_gguf_chunked
Context is set to 1K, the model is 1.63GB, and the Pixel has 6GB of RAM. According to the OS my average memory use is 3GB.
> I think the browser may be clearing the blobs from the memory when the tab gets suspended
I don't think that's the case here, as the tab is the currently active one. Maybe it's simply running out of memory? Or maybe, as on mobile Safari, there's a limit on how much RAM a tab may use?
I tried to load another 1.6GB (Bitnet) model on the phone, and that did load. Hmm.
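As a rough sanity check on the out-of-memory guess, the figures quoted above (6GB of RAM, about 3GB in use, a 1.63GB model) can be plugged into a back-of-the-envelope calculation. This is only a sketch: `memoryHeadroomGB` is a hypothetical helper, and it deliberately ignores the KV cache, compute buffers, and any per-tab limit the browser may impose.

```typescript
// Hypothetical helper: RAM left over (in GB) after loading the model weights.
// Ignores KV cache, compute buffers, and browser per-tab limits.
function memoryHeadroomGB(
  totalRamGB: number,
  usedRamGB: number,
  modelGB: number
): number {
  return totalRamGB - usedRamGB - modelGB;
}

// Numbers from the Pixel 6a report above.
const pixelHeadroomGB = memoryHeadroomGB(6, 3, 1.63);
console.log(`Estimated headroom: ${pixelHeadroomGB.toFixed(2)} GB`);
```

On these numbers there is still more than 1GB to spare, which suggests the raw model size alone does not explain the failure.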
I'll do a quick rebuild with the llama.cpp submodule updated: git clone --recurse-submodules https://github.com/ngxson/wllama.git; cd wllama; git submodule update --remote --merge; npm i; npm run build:wasm; npm run build
// Nice, Phi 3.1 mini loads and (very slowly) generates a response. It's 2.1GB.
// Updating llama.cpp solved it.