Large files cause freeze when embedding

Open rjmacarthy opened this issue 1 year ago • 2 comments

Describe the bug: Large files cause freeze when embedding

To Reproduce: Embed a file > 500 lines or so.

Expected behavior: It should work

Screenshots: n/a

Logging:

API Provider: Ollama

Chat or Auto Complete?: Embedding

Model Name: All models

Desktop:

  • OS: Ubuntu

Additional context: Nothing else

rjmacarthy, Nov 12 '24 20:11

I was noticing this too, but I'm also noticing that I can't seem to get Ollama to keep the embeddings model in memory. Four minutes into the embedding process the model disappears and the Twinny process hangs. I thought the cause was large files, but I'm now thinking it's the model being unloaded. I have set the env variable to tell it to keep models in memory for 60m, but that hasn't had any effect (I'm guessing I didn't set it properly).

gpinkham, Nov 26 '24 15:11

Some additional debugging info in case it helps.
I ran Ollama with a keep-alive of 10h and debug turned on, and verified that the model is in memory; ollama ps says it will stay there for 10 hours.
I clicked on "embed documents" and saw the twinny dialog in VS Code. It shows progress and then stops at 8.20%. The file it's on is about 350 lines of Ruby code.

In the Ollama debug output I see this:

time=2024-12-02T22:32:56.359-05:00 level=DEBUG source=sched.go:575 msg="evaluating already loaded" model=/Users/gpinkham/.ollama/models/blobs/sha256-797b70c4edf85907fe0a49eb85811256f65fa0f7bf52166b147fd16be2be4662
time=2024-12-02T22:32:56.362-05:00 level=DEBUG source=runner.go:752 msg="embedding request" content="�\x02 REMOVED CONTENTS �\x01�\x01"
time=2024-12-02T22:32:56.362-05:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=71 prompt=8 used=0 remaining=8
[GIN] 2024/12/02 - 22:32:56 | 200 | 12.461333ms | 127.0.0.1 | POST "/api/embed"
time=2024-12-02T22:32:56.371-05:00 level=DEBUG source=sched.go:407 msg="context for request finished"
time=2024-12-02T22:32:56.371-05:00 level=DEBUG source=sched.go:339 msg="runner with non-zero duration has gone idle, adding timer" modelPath=/Users/gpinkham/.ollama/models/blobs/sha256-797b70c4edf85907fe0a49eb85811256f65fa0f7bf52166b147fd16be2be4662 duration=10h0m0s
time=2024-12-02T22:32:56.371-05:00 level=DEBUG source=sched.go:357 msg="after processing request finished event" modelPath=/Users/gpinkham/.ollama/models/blobs/sha256-797b70c4edf85907fe0a49eb85811256f65fa0f7bf52166b147fd16be2be4662 refCount=0

10 minutes later the twinny dialog has not moved beyond that file or the 8.20%, and the model is still in memory (per ollama ps), so it's not Ollama unloading the model that causes the hang in twinny.

I also stopped the Ollama server, and the twinny dialog is still running (the bar keeps moving across the bottom of the dialog, but the percentage still doesn't change).

gpinkham, Dec 03 '24 03:12
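
A possible angle for whoever picks this up: since the dialog keeps spinning even after the Ollama server is stopped, one thing worth ruling out is an embedding request (or its result) being awaited with no timeout or error path. The sketch below is only an illustration of that idea and is not taken from twinny's source. The names `buildEmbedRequest` and `embedWithTimeout`, the 30 second default, and the injectable `fetchImpl` parameter are all assumptions; the `/api/embed` endpoint and its `keep_alive` field are the ones Ollama exposes.

```typescript
// Hypothetical helpers, NOT twinny's actual code.
// Shape of the JSON body sent to Ollama's POST /api/embed endpoint.
interface EmbedRequest {
  model: string
  input: string
  keep_alive?: string
}

// Build the request body; keep_alive asks Ollama to keep the model loaded.
function buildEmbedRequest(
  model: string,
  input: string,
  keepAlive = "10h"
): EmbedRequest {
  return { model, input, keep_alive: keepAlive }
}

// Send one embedding request and give up after timeoutMs instead of
// waiting forever. fetchImpl is injectable so the function can be tested
// without a running server (an assumption of this sketch).
async function embedWithTimeout(
  baseUrl: string,
  request: EmbedRequest,
  timeoutMs = 30_000,
  fetchImpl: typeof fetch = fetch
): Promise<number[][]> {
  const controller = new AbortController()
  // Abort the in-flight request once the deadline passes.
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    const response = await fetchImpl(`${baseUrl}/api/embed`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(request),
      signal: controller.signal,
    })
    if (!response.ok) {
      throw new Error(`Ollama returned HTTP ${response.status}`)
    }
    const data = (await response.json()) as { embeddings: number[][] }
    return data.embeddings
  } finally {
    // Always clear the timer so it cannot fire after the request settles.
    clearTimeout(timer)
  }
}
```

With something along these lines, a caller could catch the rejection for a single file, log which file failed, and move on to the next one, so the progress dialog would either advance or report an error rather than sit at the same percentage. Whether the real hang is in the request itself or somewhere else (for example in chunking the file before the request is sent) is not established by the logs above.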