Rejnald Lleshi

Results: 74 comments by Rejnald Lleshi

@johnnv1 thanks for the quick answer. Unfortunately, that's only ~2 GB less GPU-intensive.

Hi @aarnphm this seems to be happening when you define the `docker.env` field. So if you have a different set of variables on `envs` and you define another set of...

That would be helpful, as I ended up wasting quite a bit of time on this.

Hi @mmathew23, thanks for your prompt response. Sorry about the missing info above. Here are more details. Gemma3 1B takes 1.8 GB of memory (via ollama) on the Jetson. I...
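As a back-of-the-envelope sanity check on numbers like that (a rough sketch, not specific to Unsloth or ollama), weight-only memory is roughly parameter count × bytes per parameter; KV cache, activations, and runtime overhead come on top:

```python
def weight_mem_gb(n_params: int, bits_per_param: int) -> float:
    """Rough weight-only memory estimate in decimal GB."""
    return n_params * bits_per_param / 8 / 1e9

# A 1B-parameter model, weights only:
print(weight_mem_gb(1_000_000_000, 16))  # fp16: 2.0 GB
print(weight_mem_gb(1_000_000_000, 4))   # 4-bit quantized: 0.5 GB
```

The ~1.8 GB observed via ollama is plausibly quantized weights plus KV cache and runtime overhead, but that split is an assumption, not something measured here.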

Well, before the above error message is thrown my RAM overflows, but you're right, it doesn't throw the conventional OOM error. So if I just do ``` model, tokenizer = FastModel.from_pretrained(...

The Jetson Orin Nano doesn't have dedicated GPU VRAM; it has a unified shared memory pool, so system RAM is used as VRAM and vice versa. Correct,...
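One stdlib-only way to see the size of that single pool (on a unified-memory board like the Orin Nano, the figure reported here is the same pool the GPU allocates from; on a discrete-GPU machine it would be CPU RAM only):

```python
import os

def total_system_mem_gb() -> float:
    """Total physical memory via POSIX sysconf (Linux/macOS)."""
    pages = os.sysconf("SC_PHYS_PAGES")
    page_size = os.sysconf("SC_PAGE_SIZE")
    return pages * page_size / 1e9

print(f"total memory: {total_system_mem_gb():.1f} GB")
```

If PyTorch with CUDA is available, `torch.cuda.mem_get_info()` reports free/total device memory; on the Orin Nano both views describe the same physical pool.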

@mmathew23 kind reminder

@mmathew23 `FastModel.from_pretrained()` is where it runs out of memory, although this only happens if I'm loading my own LoRA weights. Here's the verbose stack trace that you asked for: ```...
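To pinpoint where memory spikes without waiting for the crash, peak RSS can be sampled around the suspect call (a stdlib-only sketch for Unix; `resource` is unavailable on Windows, and `ru_maxrss` is in KB on Linux but bytes on macOS). The commented line marks where the real `FastModel.from_pretrained(...)` call would sit; the demo uses a stand-in allocation instead:

```python
import resource

def peak_rss_mb() -> float:
    """Peak resident set size of this process in MB (assumes Linux: KB units)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

before = peak_rss_mb()
# model, tokenizer = FastModel.from_pretrained(...)  # the suspect call goes here
blob = bytearray(50 * 1024 * 1024)  # stand-in 50 MB allocation for the demo
after = peak_rss_mb()
print(f"peak RSS grew by ~{after - before:.0f} MB")
```

Logging this before and after the load would show whether the overflow comes from the base model weights or from merging the LoRA adapter.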

@mmathew23 just wondering if you have any other suggestions before we move on from this and fully commit to ollama.

@mmathew23 We want to use Unsloth for training & inference, but if we cannot do inference with it then we're hoping to convert the models for Ollama inference (vLLM was...