
stablediffusion fails to install on DGX Spark

Open k3mist opened this issue 1 month ago • 3 comments

LocalAI version: sha-ef44ace-nvidia-l4t-arm64-cuda-13

v2.29.0-nvidia-l4t-arm64

Environment, CPU architecture, OS, and Version: docker

-> % lscpu
Architecture:                aarch64
  CPU op-mode(s):            64-bit
  Byte Order:                Little Endian
CPU(s):                      20
  On-line CPU(s) list:       0-19
Vendor ID:                   ARM
  Model name:                Cortex-X925
    Model:                   1
    Thread(s) per core:      1
    Core(s) per socket:      10
    Socket(s):               1
    Stepping:                r0p1
    CPU(s) scaling MHz:      90%
    CPU max MHz:             4004.0000
    CPU min MHz:             1378.0000
    BogoMIPS:                2000.00
    Flags:                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fc
                             ma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca 
                             pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16
                              i8mm bf16 dgh bti ecv afp wfxt
  Model name:                Cortex-A725
    Model:                   1
    Thread(s) per core:      1
    Core(s) per socket:      10
    Socket(s):               1
    Stepping:                r0p1
    CPU(s) scaling MHz:      86%
    CPU max MHz:             2860.0000
    CPU min MHz:             338.0000
    BogoMIPS:                2000.00
    Flags:                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fc
                             ma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca 
                             pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16
                              i8mm bf16 dgh bti ecv afp wfxt
-> % lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04.3 LTS
Release:        24.04
Codename:       noble
-> % docker exec -it local-ai bash
root@5837be385699:/# nvidia-smi
Wed Dec 10 13:14:41 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GB10                    On  |   0000000F:01:00.0 Off |                  N/A |
| N/A   39C    P8              4W /  N/A  | Not Supported          |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

Describe the bug
The stablediffusion backend fails to install with:

Error installing backend "cuda13-nvidia-l4t-arm64-stablediffusion-ggml": not a valid backend: run file not found "/backends/cuda13-nvidia-l4t-arm64-stablediffusion-ggml/run.sh"

To Reproduce
docker run -ti --name local-ai -p 32000:8080 --gpus all localai/localai:sha-ef44ace-nvidia-l4t-arm64-cuda-13

Then navigate to the local site > Backends > try to install cuda13-nvidia-l4t-arm64-stablediffusion-ggml.

Expected behavior
The backend should install.

Logs
1:08PM ERR Run file not found runFile=/backends/cuda13-nvidia-l4t-arm64-stablediffusion-ggml/run.sh
1:08PM ERR error installing backend localai@cuda13-nvidia-l4t-arm64-stablediffusion-ggml error="not a valid backend: run file not found "/backends/cuda13-nvidia-l4t-arm64-stablediffusion-ggml/run.sh""

Additional context
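A quick way to confirm the symptom (just a diagnostic sketch; the path comes straight from the error above) is to exec into the container and list what actually got extracted for the backend:

  docker exec -it local-ai ls -la /backends/
  docker exec -it local-ai ls -la /backends/cuda13-nvidia-l4t-arm64-stablediffusion-ggml/

If the install left the directory in a broken state, the second listing should show a directory without run.sh, which is exactly what the installer is complaining about.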

k3mist avatar Dec 10 '25 13:12 k3mist

CUDA 13 support on the DGX Spark is still in the works (but it's functional): you need to pull the corresponding development backend for now, until there is a tagged release.
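Concretely, that means installing the -development variant of the backend (e.g. cuda13-nvidia-l4t-arm64-stablediffusion-ggml-development) from the backend gallery in the WebUI, or, assuming the backends CLI subcommand is available in this image, from inside the container:

  docker exec -it local-ai local-ai backends install cuda13-nvidia-l4t-arm64-stablediffusion-ggml-development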

mudler avatar Dec 11 '25 07:12 mudler

CUDA 13 support on the DGX Spark is still in the works (but it's functional): you need to pull the corresponding development backend for now, until there is a tagged release.

Thank you! I will give that a shot.

k3mist avatar Dec 11 '25 15:12 k3mist

That worked! However, the generated images via dreamshaper are not at all what the prompt describes, lol. I suppose this is why it's still in development. Anyway, I will periodically check back on progress. Thanks again for replying.

I've just begun poking around with Stable Diffusion, so this could be a me issue - should this have worked?

"A cyberpunk cityscape with neon lights reflecting on rainy streets, people wearing futuristic clothing, ultra-detailed."

cuda13-nvidia-l4t-arm64-diffusers-development: (generated image attached)

cuda13-nvidia-l4t-arm64-stablediffusion-ggml-development: (generated image attached)
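For completeness, an equivalent request through LocalAI's OpenAI-compatible image endpoint would look roughly like this (the model name and size are placeholders for whatever the gallery entry registers; port 32000 matches the docker run above):

  curl http://localhost:32000/v1/images/generations \
    -H "Content-Type: application/json" \
    -d '{
      "model": "dreamshaper",
      "prompt": "A cyberpunk cityscape with neon lights reflecting on rainy streets, people wearing futuristic clothing, ultra-detailed.",
      "size": "512x512"
    }'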

k3mist avatar Dec 11 '25 15:12 k3mist

Things are much better now - can you give it a shot again with the latest release? I've only tried qwen-image so far, but it's looking good.

The image that should be used for the DGX Spark is: localai/localai:latest-nvidia-l4t-arm64-cuda-13
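With the same flags as in the original reproduction, that would be roughly:

  docker run -ti --name local-ai -p 32000:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64-cuda-13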

mudler avatar Dec 30 '25 21:12 mudler

I'm not getting any errors with qwen3-image, but it appears not to be running at all, and there is nothing of note in the logs.

I did have to set f16: false in the model config; it was complaining about no diffusion pipeline. After prompting, however, it just sits there, and I don't see any load in nvtop or htop either.

Maybe I'm missing something, not sure.
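For reference, a minimal sketch of what the model config looks like with that change, assuming the standard LocalAI model YAML fields - everything except f16: false is illustrative, and the backend/model values depend on what the gallery entry actually installs:

  name: qwen3-image
  backend: diffusers        # assumption - use whichever backend the gallery entry specifies
  f16: false                # the setting I had to change, per the pipeline complaint above
  parameters:
    model: Qwen/Qwen-Image  # placeholder model reference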

However, with dreamshaper I am seeing a significant improvement. I'm not sure why, but the first attempt produced undesirable results - not as bad as my previous screenshot, but not much better either.

Nonetheless, the great news is that my 2nd and 3rd attempts with dreamshaper produced something pretty awesome:

A cyberpunk cityscape with neon lights reflecting on rainy streets, people wearing futuristic clothing, ultra-detailed.

Backend: cuda13-nvidia-l4t-arm64-stablediffusion-ggml-development
LocalAI version: v3.9.0 (aadec0b8cb2a7c608981823e7b8a003551662205)

(two generated images attached)

k3mist avatar Jan 01 '26 20:01 k3mist

I did another for good measure (and fun). I really like this one:

(generated image attached)

k3mist avatar Jan 01 '26 20:01 k3mist

I'm not getting any errors with qwen3-image, but it appears not to be running at all, and there is nothing of note in the logs.

Note that this could be just the model being downloaded in the background. In the case of qwen-image, the first request will cause the model to download if not already present, and qwen-image is quite big (IIRC around 40 GB).

However, great to hear that dreamshaper works now!
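If you want to verify it is actually downloading rather than hanging, the container logs and the size of the models directory should show progress (the models path below is the usual container default and may differ depending on how volumes are mounted):

  docker logs -f local-ai
  docker exec -it local-ai du -sh /build/models   # or /models, depending on the image/volume layout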

mudler avatar Jan 01 '26 23:01 mudler

Note that this could be just the model being downloaded in the background.

Good call. Tried again this morning:

qwen-image: (generated image attached)

k3mist avatar Jan 02 '26 13:01 k3mist