
[Bug] Ubuntu system gets stuck after several inference runs

Open cvtower opened this issue 10 months ago • 4 comments

Checklist

  • [x] 1. I have searched for related issues and FAQs (https://github.com/mit-han-lab/nunchaku/blob/main/docs/faq.md) but was unable to find a solution.
  • [x] 2. The issue persists in the latest version.
  • [x] 3. Please note that without environment information and a minimal reproducible example, it will be difficult for us to reproduce and address the issue, which may delay our response.
  • [ ] 4. If your report is a question rather than a bug, please submit it as a discussion at https://github.com/mit-han-lab/nunchaku/discussions/new/choose. Otherwise, this issue will be closed.
  • [ ] 5. If this is related to ComfyUI, please report it at https://github.com/mit-han-lab/ComfyUI-nunchaku/issues.
  • [x] 6. I will do my best to describe the issue in English.

Describe the Bug

I just installed the 0.3.0.dev20250529+torch2.6 version locally from source. The first time, inference (with either the official weights or my custom fluxgym LoRA weights) works fine.

After several runs, during the 3rd or 4th inference, the system hangs and I have to restart.

What's more, the most recently generated image is lost after the restart.

Environment

OS: Ubuntu 24.04 x86_64; Python 3.10; CUDA 12.6; PyTorch 2.6; nunchaku==0.3.0.dev20250529+torch2.6, built locally from the GitHub source code

Reproduction Steps

https://github.com/mit-han-lab/nunchaku/blob/main/examples/flux.1-dev-lora.py https://huggingface.co/black-forest-labs/FLUX.1-dev

Running the example script above with the FLUX.1-dev weights reproduces this problem.

cvtower commented May 30 '25 04:05


It seems there might be a memory leak after the inference demo; I suspect this is the cause of the hang. I can't verify it with Linux commands such as ps, though.
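For reference, the resident set size (RSS) can also be read directly from /proc on Linux using only the standard library. This is a generic, stdlib-only sketch, not anything nunchaku-specific:

```python
import os
import time
from pathlib import Path

def rss_kib(pid):
    """Return a process's resident set size (VmRSS) in KiB, read from /proc."""
    for line in Path(f"/proc/{pid}/status").read_text().splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0

# Sample this process's own RSS a few times; point `pid` at the
# inference script's PID instead to watch it across runs.
for _ in range(3):
    print(rss_kib(os.getpid()), "KiB")
    time.sleep(0.2)
```

To watch the example script, resolve its PID first (e.g. with `pgrep -f flux.1-dev-lora.py`) and pass that to `rss_kib` in a loop; steadily increasing samples across runs would support the leak hypothesis.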

cvtower commented May 30 '25 04:05

I see. I will continue testing the memory issues. I will let you know then.

lmxyy commented May 30 '25 04:05

A screenshot of memory usage during three runs of the inference script is attached.

Clearly, system memory usage increases after each run of the script (with the model loaded).
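One thing worth ruling out when host RSS grows across runs: it is not always a Python-level leak, since glibc's allocator can retain freed pages instead of returning them to the OS. A stdlib-only sketch (assumes Linux/glibc; `malloc_trim` is glibc-specific) that forces a collection pass and asks glibc to hand freed heap back between runs:

```python
import ctypes
import gc

def release_host_memory():
    """Run a full GC pass, then ask glibc to return freed heap to the OS.

    malloc_trim is glibc-specific; on other platforms this degrades
    to a plain gc.collect() and returns 0.
    """
    gc.collect()
    try:
        libc = ctypes.CDLL("libc.so.6")
        # malloc_trim returns 1 if memory was released to the OS, 0 otherwise.
        return libc.malloc_trim(0)
    except OSError:
        return 0

release_host_memory()
```

If RSS drops noticeably after calling this between inference runs, the growth is allocator retention rather than a true leak in the library.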

cvtower commented May 30 '25 04:05

The 4060 Ti 16GB Ubuntu system is fine, but on the 4080 16GB Ubuntu system memory keeps increasing until ComfyUI dies or the session gets kicked back to the login screen. Also, the 4080 16GB currently hits a CUDA OOM error when CPU offload is set to auto or disabled, on both Ubuntu and Windows.

dioscuri0860 commented May 30 '25 08:05

Install the latest released version, nunchaku-0.3.0. This bug has been fixed on my Ubuntu 24 system. Thanks to the developers for the great work. 🚀

idonashino commented Jun 02 '25 10:06

Was your issue fixed in the official v0.3.0?

lmxyy commented Jun 02 '25 18:06

Install the latest released version of nunchaku-0.3.0. This bug has been fixed on my Ubuntu24. Thanks for the great work of the developers. 🚀

Thanks for your help! I will update and give it a try later.

cvtower commented Jun 03 '25 05:06

Was your issue fixed in the official v0.3.0?

Yes. That was with nunchaku==0.3.0.dev20250529.

I will update to the v0.3.0 released yesterday and give it a try later.

cvtower commented Jun 03 '25 05:06

After updating to the latest 0.3.0 wheel and the ComfyUI 0.3.0 node, flux1_fill_dev errors out with the nunchaku basic workflow, while flux.1 dev works fine: CUDA error: out of memory (at C:\Users\muyangl\actions-runner_work\nunchaku\nunchaku\src\Tensor.h:95)

dioscuri0860 commented Jun 03 '25 06:06

1. Memory usage dropped dramatically after updating to the v0.3.0 release, compared to the 0.3.0.dev20250529 version.
2. With the official v0.3.0 wheel I still get an import error, so I compiled it from local source and it works fine.

I only used flux.1-dev for testing.
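For anyone debugging the same import error, a quick stdlib-only way to confirm which build of the package Python actually sees (the distribution name `nunchaku` is taken from the pip package discussed in this thread):

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

# Prints the installed version, or "not installed" if the wheel is missing.
print(installed_version("nunchaku") or "not installed")
```

If this reports the dev version while you expect the release (or vice versa), the wrong wheel is on the import path, which can masquerade as an import error after a local source build.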

Closing this issue; anyone can open a new one if needed.

Thanks for all your help.

cvtower commented Jun 03 '25 12:06