unsupported op 'IM2COL_3D' on Mac
I tried to run Wan2.2-TI2V-5B on a Mac Mini, and it hits an assert:
ggml/src/ggml-metal/ggml-metal.m:2068: unsupported op 'IM2COL_3D'
Is there any way to avoid this problem? I was able to run Flux dev successfully, by the way.
I believe this is because IM2COL_3D hasn't been implemented for Metal in GGML yet, similar to what was mentioned in #822. I'm not sure whether anyone is working on it for Metal, though. llama.cpp has implemented it for Vulkan, and at least that should arrive whenever GGML is updated in stable-diffusion.cpp (I'm waiting on that myself).
Great! @MrSnichovitch - do you know what it takes to point to the latest ggml?
For reference, here is the stack trace I get:
* frame #0: 0x000000019b7de388 libsystem_kernel.dylib`__pthread_kill + 8
frame #1: 0x000000019b81788c libsystem_pthread.dylib`pthread_kill + 296
frame #2: 0x000000019b720a3c libsystem_c.dylib`abort + 124
frame #3: 0x00000001001bd730 sd`ggml_abort + 160
frame #4: 0x00000001001bad44 sd`ggml_metal_encode_node + 27288
frame #5: 0x00000001001b4218 sd`__ggml_backend_metal_set_n_cb_block_invoke + 596
frame #6: 0x00000001001b3cb0 sd`ggml_backend_metal_graph_compute + 368
frame #7: 0x00000001001d3684 sd`ggml_backend_graph_compute + 32
frame #8: 0x000000010009c4bc sd`GGMLRunner::compute(std::__1::function<ggml_cgraph* ()>, int, bool, ggml_tensor**, ggml_context*) + 648
frame #9: 0x00000001000bb694 sd`WanModel::compute(int, DiffusionParams, ggml_tensor**, ggml_context*) + 204
frame #10: 0x000000010010cf7c sd`StableDiffusionGGML::sample(ggml_context*, std::__1::shared_ptr<DiffusionModel>, bool, ggml_tensor*, ggml_tensor*, SDCondition, SDCondition, SDCondition, ggml_tensor*, float, sd_guidance_params_t, float, sample_method_t, std::__1::vector<float, std::__1::allocator<float>> const&, int, SDCondition, std::__1::vector<ggml_tensor*, std::__1::allocator<ggml_tensor*>>, bool, ggml_tensor*, ggml_tensor*, float)::'lambda'(ggml_tensor*, float, int)::operator()(ggml_tensor*, float, int) const + 1308
frame #11: 0x000000010007129c sd`StableDiffusionGGML::sample(ggml_context*, std::__1::shared_ptr<DiffusionModel>, bool, ggml_tensor*, ggml_tensor*, SDCondition, SDCondition, SDCondition, ggml_tensor*, float, sd_guidance_params_t, float, sample_method_t, std::__1::vector<float, std::__1::allocator<float>> const&, int, SDCondition, std::__1::vector<ggml_tensor*, std::__1::allocator<ggml_tensor*>>, bool, ggml_tensor*, ggml_tensor*, float) + 3192
frame #12: 0x000000010007785c sd`generate_video + 7096
frame #13: 0x000000010000c2e4 sd`main + 3272
frame #14: 0x000000019b476b98 dyld`start + 6076
I patched the change mentioned in #822 locally and tried to use the Vulkan backend instead; now I hit a different error:
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x4a)
* frame #0: 0x000000010022b794 sd`ggml_vk_build_graph(ggml_backend_vk_context*, ggml_cgraph*, int, ggml_tensor*, int, bool, bool, bool, bool) + 280
frame #1: 0x0000000100225668 sd`ggml_backend_vk_graph_compute(ggml_backend*, ggml_cgraph*) + 356
frame #2: 0x0000000100269d84 sd`ggml_backend_graph_compute + 32
frame #3: 0x000000010009bcdc sd`GGMLRunner::compute(std::__1::function<ggml_cgraph* ()>, int, bool, ggml_tensor**, ggml_context*) + 648
frame #4: 0x00000001000ab3fc sd`T5CLIPEmbedder::get_learned_condition_common(ggml_context*, int, std::__1::tuple<std::__1::vector<int, std::__1::allocator<int>>, std::__1::vector<float, std::__1::allocator<float>>, std::__1::vector<float, std::__1::allocator<float>>>, int, bool) + 596
frame #5: 0x00000001000aaa6c sd`T5CLIPEmbedder::get_learned_condition(ggml_context*, int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, int, int, int, int, bool) + 168
frame #6: 0x00000001000769e0 sd`generate_video + 5596
frame #7: 0x000000010000ba84 sd`main + 3272
frame #8: 0x000000019b476b98 dyld`start + 6076
You should note that I barely know what I'm doing here, so I'm hoping someone with more programming experience can chime in. Can macOS use Vulkan? I thought Metal was its rough equivalent.
Anyway, I can tell you I only sort of managed to get Vulkan working on Linux: I downloaded llama.cpp, completely overwrote the stable-diffusion.cpp ggml folder with llama.cpp's copy, and compiled. It was the only way to avoid chasing down errors in the header files during compilation.
I have to run sd with the --offload-to-cpu and --vae-on-cpu options to get it to output anything, but the output is an incoherent mess with the WAN 2.2 TI2V 5B Q8_0 gguf model. WAN 2.1 T2V works much better in Vulkan on my system, but it's slow as hell.
There is a macOS version of Vulkan, which I downloaded. I do have programming experience, but not in this kind of code, so I can only post my observations.
After turning on debug messages, the crash seems to be happening because 'pipeline' is null in this part of the code:
vk_pipeline pipeline = ggml_vk_op_get_pipeline(ctx, src0, src1, src2, node, node->op);
ggml_pipeline_request_descriptor_sets(ctx, pipeline, 1);
node->op is GET_ROWS, whatever that means.
Sounds like missing quant types for GET_ROWS (as in #851). What types do you see in src0->type and src1->type?
I am getting the same unsupported op 'IM2COL_3D' error with Linux and Vulkan too.
@evcharger: Are you running a build with today's source code (master-306-2abe945)? I just tested Vulkan built with it using wan2.1_t2v_1.3B_fp16.safetensors and it's working as expected.
Ok, with today's version it works, although the output is gibberish. Actually, with today's version on Ubuntu with Vulkan, most models produce gibberish :(
This is likely #847, fixed by https://github.com/ggml-org/llama.cpp/commit/9073a73d82a916cea0809de225ef5175c3a86e91 (but the fix hasn't landed in ggml yet).
Yes, the original problem went away after I synced the code to the latest revision. However, when I ran it, my Mac's display started blinking and I killed the process out of caution.
My last comment was too quick. I added --offload-to-cpu, and then the run proceeded until it hit the same original problem.
https://github.com/CLDawes/ggml/tree/patch-qwen-image
That branch adds Metal support for IM2COL_3D and DIAG_MASK_INF, plus a fix for PAD so it passes the test suite.
I got QuantStack's Qwen-Image-GGUF (Q6_K) running on a Mac Mini M4 Pro so I wouldn't have to spend money on a graphics card to futz about, and that's all I wanted out of this.
It works in master-331-90ef5f8, but I don't know if Wan takes advantage of any additional unimplemented/buggy operations.
I guess this would also fix #857?
@CLDawes, ggml changes are usually applied to llama.cpp first, then extracted to the library, and then pulled into sd.cpp, so I suggest submitting that as a llama.cpp PR.