stable-diffusion.cpp

ggml-metal.m - GGML_ASSERT: ne00 % 4 == 0 when generating images of dimensions 640x640

Open phudtran opened this issue 1 year ago • 4 comments

This seems to be an issue with group_norm on the Metal backend; I haven't tried other backends.

./bin/sd -m models/gsdf/Counterfeit-V2.5/Counterfeit-V2.5_pruned.safetensors -p "a cat" --steps 2 -H 640 -W 640
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Max
ggml_metal_init: picking default device: Apple M2 Max
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading 'stable-diffusion.cpp/build/bin/ggml-metal.metal'
ggml_metal_init: GPU name:   Apple M2 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple8  (1008)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction support   = true
ggml_metal_init: simdgroup matrix mul. support = true
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 22906.50 MB
[INFO ] stable-diffusion.cpp:142  - loading model from 'models/gsdf/Counterfeit-V2.5/Counterfeit-V2.5_pruned.safetensors'
[INFO ] model.cpp:676  - load models/gsdf/Counterfeit-V2.5/Counterfeit-V2.5_pruned.safetensors using safetensors format
[INFO ] stable-diffusion.cpp:164  - Stable Diffusion 1.x
[INFO ] stable-diffusion.cpp:170  - Stable Diffusion weight type: f32
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   469.45 MiB, (  471.33 / 21845.34)
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =  2155.34 MiB, ( 2626.67 / 21845.34)
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =    94.47 MiB, ( 2721.14 / 21845.34)
[INFO ] stable-diffusion.cpp:306  - total params memory size = 1408.32MB (clip 469.44MB, unet 2155.33MB, vae 94.47MB, controlnet 0.00MB)
[INFO ] stable-diffusion.cpp:310  - loading model from 'models/gsdf/Counterfeit-V2.5/Counterfeit-V2.5_pruned.safetensors' completed, taking 1.03s
[INFO ] stable-diffusion.cpp:327  - running in eps-prediction mode
[INFO ] stable-diffusion.cpp:1374 - apply_loras completed, taking 0.00s
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =     1.41 MiB, ( 2722.55 / 21845.34)
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =     1.41 MiB, ( 2722.55 / 21845.34)
[INFO ] stable-diffusion.cpp:1413 - get_learned_condition completed, taking 69 ms
[INFO ] stable-diffusion.cpp:1429 - sampling using Euler A method
[INFO ] stable-diffusion.cpp:1433 - generating image: 1/1 - seed 42
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =  1320.61 MiB, ( 3572.30 / 21845.34)
GGML_ASSERT: stable-diffusion.cpp/ggml/src/ggml-metal.m:2034: ne00 % 4 == 0
GGML_ASSERT: stable-diffusion.cpp/ggml/src/ggml-metal.m:2034: ne00 % 4 == 0
zsh: abort      ./bin/sd -m  -p "a cat" --steps 2 -H 640 -W 640

phudtran avatar Mar 06 '24 19:03 phudtran

It also asserts for every square image size I tried except 512x512, 768x768, and 1024x1024 (I haven't tested past 1024x1024).
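One possible reading of that pattern, as a rough sketch rather than a confirmed diagnosis: assume ne00 in the failing assert is the width of the feature map passed to group_norm, that the SD 1.x latent is 1/8 of the pixel size, and that the UNet halves the width three times. The helper names below (unet_widths, passes_assert) are hypothetical and not part of ggml or stable-diffusion.cpp.

```python
# Sketch only. Assumptions (not confirmed by the ggml source):
#   - ne00 is the feature-map width seen by group_norm
#   - the SD 1.x latent is pixels / 8
#   - the UNet halves the width three times
def unet_widths(pixels, vae_factor=8, downsamples=3):
    """Feature-map widths at each assumed UNet resolution level."""
    latent = pixels // vae_factor
    return [latent >> level for level in range(downsamples + 1)]


def passes_assert(pixels):
    """True if every assumed width satisfies ne00 % 4 == 0."""
    return all(width % 4 == 0 for width in unet_widths(pixels))


if __name__ == "__main__":
    for size in range(512, 1025, 64):
        status = "ok" if passes_assert(size) else "would assert"
        print(size, unet_widths(size), status)
```

Under these assumptions, 640 gives widths 80, 40, 20, 10, and 10 is not a multiple of 4, while only sides that are multiples of 256 pass. That matches the sizes listed above, but it would not account for setups where every size asserts.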

phudtran avatar Mar 06 '24 19:03 phudtran

This seems to be an issue with the implementation of the ggml Metal backend. You can try removing the corresponding assert to see if the issue persists.

leejet avatar Mar 10 '24 09:03 leejet

I get the same assert, but with the v2-1_768-ema-pruned.safetensors model on an M1 Pro. No image dimensions work.

Any suggestions for a workaround or fix?

smasyutin avatar Apr 14 '24 11:04 smasyutin

Same issue at all dimensions. After commenting out the assert line in ggml-metal.m, it just works, with no noticeable difference.

remixer-dec avatar May 11 '24 16:05 remixer-dec