Oleg Klimov
I have an M1 in my MacBook Air. I've tested the smallest reasonable models, for example a 1B StarCoder running on llama.cpp: ``` "-m", "starcoder-1b-q8_0.gguf" 897.71 ms / 557 tokens ( 1.61 ms...
Yes, just give it a prompt of 1000 tokens. Here you can try my script: ``` code = """import pygame import numpy as np import attractgame_particle W = 640 H...
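For anyone who wants to reproduce that kind of measurement, here is a minimal sketch, assuming the llama-cpp-python bindings (the timings above came from the llama.cpp binary itself, so this only approximates them); the model path and the synthetic prompt are placeholders:

```python
# A minimal sketch, assuming llama-cpp-python is installed; the numbers
# quoted above came from the llama.cpp CLI, so treat this only as an
# approximation of the same measurement.
import time
from llama_cpp import Llama

llm = Llama(model_path="starcoder-1b-q8_0.gguf", n_ctx=2048, verbose=False)

# Build a prompt of very roughly 1000 tokens out of repeated code-like text.
prompt = "import pygame\nimport numpy as np\n" * 150
n_prompt = len(llm.tokenize(prompt.encode("utf-8")))

t0 = time.perf_counter()
llm(prompt, max_tokens=1)  # max_tokens=1 so we mostly measure prompt processing
dt = time.perf_counter() - t0

print(f"{dt * 1000:.1f} ms for {n_prompt} prompt tokens "
      f"({dt * 1000 / n_prompt:.2f} ms per token)")
```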
Hey @domdorn, thanks for your results! That should be less than a second for a 2k context; maybe we'll think about official support 🤔 > refact locally on my mac...
I like this idea; it could also ask for a name when a finetune starts
whoops, that's clearly a problem
Hi @st01cs! It's strange that chat works but code completion doesn't, since both go via the same connection. The port is chosen randomly over there: https://github.com/smallcloudai/refact-vscode/blob/shared-chat-lib/src/launchRust.ts#L78 One thing you can try...
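To help rule out a local networking problem, here is a small, hypothetical Python probe (not part of refact-vscode) that checks whether a given port is accepting TCP connections; pass in the port the plugin reports for its LSP process:

```python
# Hypothetical diagnostic, not project code: check whether the locally
# launched LSP is accepting TCP connections on its (randomly chosen) port.
import socket
import sys

def port_is_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    port = int(sys.argv[1])
    print(f"port {port} open: {port_is_open(port)}")
```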
Hey @ukrolelo, thanks for reporting. Can you please check whether you can find anything suspicious in the LSP logs, for example a panic: `cat ~/.cache/refact/logs/rustbinary.2024-06-28 | grep Panic`
I looked at the logs; it appears the GPU is super slow:
```
20240627 09:11:32 WEBUI comp-533320dc437d
20240627 09:12:02 WEBUI TIMEOUT comp-533320dc437d
20240627 09:12:33 WEBUI 60954.0ms comp-533320dc437d result arrived too late...
```
Okay we need to reproduce this :/ @hazratisulton
Thanks for reporting. I don't think we do anything that could cause memory leaks. Hmm, maybe it's the torch version or the CUDA version or something like that 🤔
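To narrow that down, here is a small, hypothetical snippet (not project code) that logs PyTorch's view of GPU memory over time; a steady climb across requests would point at the serving code rather than the torch/CUDA combination:

```python
# Hypothetical diagnostic, not project code: periodically log PyTorch's
# GPU memory counters to see whether usage climbs across requests.
import time
import torch

def log_gpu_memory(tag: str = "") -> None:
    alloc = torch.cuda.memory_allocated() / 2**20     # MiB currently held by tensors
    reserved = torch.cuda.memory_reserved() / 2**20   # MiB held by the caching allocator
    print(f"{tag} allocated={alloc:.1f} MiB reserved={reserved:.1f} MiB")

if __name__ == "__main__":
    for i in range(10):
        log_gpu_memory(f"sample {i}")
        time.sleep(60)
```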