ZLUDA
Support Meshroom
Will close #79 and #149
Hi, I'm from the Meshroom team. Do you need any help?
Yes, if you have an AMD GPU, can you check whether Meshroom works under ZLUDA? Instructions are in this PR's README or here: https://github.com/vosen/ZLUDA/issues/79#issuecomment-2019264969. I've run a sample dataset, but I'm entirely unfamiliar with Meshroom and I'm not sure I'm using it properly.
Additionally, is there a way to make sure that https://github.com/alicevision/CCTag/pull/210 gets merged and then picked up by AliceVision? It not only adds support for CUDA 12, but also makes the CUDA code get compiled with PTX, which ZLUDA requires.
@vosen https://github.com/alicevision/CCTag/pull/210 is merged now
@vosen The CCTag PR has been merged and will be used in the next release. In any case, CCTag is NOT used in the default photogrammetry pipeline. It's only used for specific use cases (scaling the scene to a real-world coordinate system, etc.). The critical point for all users is to get the depth map node working. Unfortunately, we don't have an AMD card in the team to test with. Let us know if we can help.
@fabiencastan Many thanks. CCTag is not used by you, but it's used by the CUDA runtime :). CCTag is built with the CUDA runtime, and it's up to the CUDA runtime to decide when to load kernel files. In some cases the CUDA runtime will try to do this at the earliest possible moment. The CUDA runtime will call cuModuleLoadData(...) trying to load the ELF binary with kernel modules, ZLUDA returns CUDA_ERROR_NOT_SUPPORTED, this error gets surfaced to whichever CUDA function was called initially, and Meshroom reports a failure. I'm not 100% sure under what conditions this happens. I can reproduce it with Linux meshroom_batch, but not with the Windows executables. BTW, ZLUDA is not blameless here. We could handle this and return a fake CUmodule, but it's a tricky feature - we can't just return an empty CUmodule because usually the next step from the runtime is to load all globals and kernels from the module. We would have to parse out globals and kernels from the ELF.
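To make the failure path described above concrete, here is a toy Python model of it. This is only a sketch, not ZLUDA or CUDA runtime code: `cu_module_load_data`, `runtime_first_call`, and the module names `depth_map` and `cctag` are hypothetical stand-ins, and detecting precompiled binaries by the raw ELF magic is a simplifying assumption. The only real values are the `CUresult` codes `CUDA_SUCCESS` (0) and `CUDA_ERROR_NOT_SUPPORTED` (801).

```python
# Toy model of the eager module loading failure. Not actual ZLUDA or CUDA code.
CUDA_SUCCESS = 0
CUDA_ERROR_NOT_SUPPORTED = 801  # value from the CUDA driver API's CUresult enum

ELF_MAGIC = b"\x7fELF"


def cu_module_load_data(image: bytes) -> int:
    """Hypothetical stand-in for a driver that accepts PTX text but
    rejects precompiled ELF kernel binaries."""
    if image.startswith(ELF_MAGIC):
        return CUDA_ERROR_NOT_SUPPORTED
    return CUDA_SUCCESS


def runtime_first_call(embedded_images):
    """Hypothetical model of a runtime that loads every embedded module
    on the first CUDA call, whether or not the application uses it."""
    for name, image in embedded_images:
        status = cu_module_load_data(image)
        if status != CUDA_SUCCESS:
            # The error surfaces to the caller's first CUDA function,
            # even if that function has nothing to do with this module.
            return status, name
    return CUDA_SUCCESS, None


# An application whose modules all ship PTX loads fine.
ptx_only = [("depth_map", b".version 7.0\n.target sm_50\n")]
# Linking one library that ships only ELF kernels breaks the first call.
with_elf = ptx_only + [("cctag", ELF_MAGIC + b"\x02\x01\x01")]

ok_status = runtime_first_call(ptx_only)
failed_status = runtime_first_call(with_elf)
print(ok_status)      # (0, None)
print(failed_status)  # (801, 'cctag')
```

In this model the second call fails because of a module the pipeline never uses, which mirrors why building CCTag with PTX matters even though the default pipeline does not call into it.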
Anyway, I'm merging this
Is there a fix for CUDA_ERROR_NOT_SUPPORTED on Linux?