Oleg Klimov comments

Results: 50 comments of Oleg Klimov

VRAM memory leak for Refact.AI 1.6B

I'll try to reproduce

VRAM memory leak for Refact.AI 1.6B

I left 1.6b (regular backend) running for a day, and memory settled at 6.19 GB of RAM. I additionally sent 750 completion requests today and it's still 6.19 GB. I don't...

VRAM memory leak for Refact.AI 1.6B

Called for help from @mitya52

VRAM memory leak for Refact.AI 1.6B

Hmm, now I see 11.9 GB on my setup 🤔

VRAM memory leak for Refact.AI 1.6B

Cool!

Handle OOM better on smaller/older GPUs, or bigger models on regular GPUs

It doesn't say "out of memory" for you. 🤔 Not sure how to debug this. @bonswouar what GPU do you have?

Plugin in PyCharm and local model in Windows.

Yes, we want CPU support, and a small inference server without many dependencies would be great. The current work is in #77

Plugin in PyCharm and local model in Windows.

We'll actually solve this! New plugins with a Rust binary will use a standard API (HF or OpenAI style).

EPIC: Run Self-hosted version on CPU

Hi @octopusx, we tested various models on CPU: it takes about 4-8 seconds for a single code completion, even for 1.6b or a starcoder 1b, on Apple M1 hardware. Maybe we'll...

EPIC: Run Self-hosted version on CPU

Ah, I see, that makes total sense. I think the best way to solve this is to add providers to the Rust layer, for the new plugins. We'll release the...