AliceVision [Request] Remove CUDA dependency

I realize this is a lot to just ask without the time and ability to implement and maintain the necessary changes myself. Hence, I’ll spare you the long-winded political speech. The gist of it is: You can’t call something free and open source software when it depends on and endorses proprietary components whose only purpose is vendor lock-in.

AliceVision should be able to work without CUDA, no matter how glacially slowly. I prefer inefficient CPU-only computation that spills registers and caches all over the place over a requirement for GPUs with inferior Linux support and very unstable drivers (my GTX 960 keeps freezing my computer whenever NVIDIA’s driver decides it doesn’t want to do memory management anymore, and it is IMPOSSIBLE to report this problem to NVIDIA unless you’re a big corp and there’s money involved). I simply don’t have the patience to deal with this garbage, and I desperately want to move to a different GPU vendor so I get proper support for my platform.

Ideally, the CUDA parts should be ported to an open platform such as Vulkan or the older OpenCL.

Jul 09 '18 20:07 mia-0

Currently, we have neither the interest nor the resources to reimplement the CUDA code in another GPU framework. If someone is willing to make this contribution, we will support it and help with integration.

Jul 10 '18 13:07 fabiencastan

I was looking into this to see whether any tools would make such a transition easier. I found a project called swan that is meant to make it very simple to effectively 'translate' CUDA kernels and code into OpenCL equivalents. It has not been updated in some time, though, so it may not help very much.

I feel like it's also worth pointing out that OpenCL works on most embedded GPUs and integrated GPUs, and many FPGAs include drop-in modules to allow OpenCL functionality. All of this means that if a change like this were made to AliceVision, there would be many new potential uses, such as microcomputer clusters or use directly on mobile devices.

Jul 14 '18 10:07 cody-code-wy

It's difficult to find a good solution in this technology war, with Apple's deprecation of OpenGL and OpenCL: https://developer.apple.com/macos/whats-new#deprecationofopenglandopencl

Another interesting project on this topic is HIP: https://gpuopen.com/compute-product/hip-convert-cuda-to-portable-c-code

Jul 14 '18 11:07 fabiencastan

That one is easy though: ditch OSX support. OSX is IME by far the worst and most buggy implementation of POSIX APIs that I've had to work with.

Jul 14 '18 11:07 zvrba

I agree that there's no particularly good solution currently.

I agree that Apple's deprecation of OpenCL could be somewhat problematic, but I feel like it's worth pointing out that relatively few of Apple's systems have any support for Nvidia cards, so CUDA is not much better for supporting Mac OS.

Also, HIP looks like a pretty nice option. There seem to have been a few interesting similar projects in the past, like gpuocelot, which is sadly now defunct.

Apparently Vulkan can be used for GPGPU, and that's supported on Windows and Linux on both AMD and Nvidia, and with MoltenVK on anything supporting Apple's Metal APIs. But Vulkan is still pretty new, so there's not much info out there about using it for GPGPU...

Jul 15 '18 08:07 cody-code-wy

I would be interested in trying Halide, as it lets you write high-level algorithms while still allowing fine tuning of the scheduling. It then generates code for each target.

HPG2017_FastImageProcessing halide-inria-march2017

Jul 15 '18 08:07 fabiencastan

ISPC (https://ispc.github.io/) could be another option. It also has an (experimental) PTX backend.

Jul 15 '18 08:07 zvrba

Halide looks like a pretty good option. While there is no Metal backend yet, it looks like (from issues on their GitHub) a few people may be working on one, and obviously OSX still has OpenCL for now.

And with support for ARM v7/NEON it could be used on Raspberry Pis (2 and later) and the like, and even on Android devices. That could seriously open up what AliceVision could be used for in the future.

Jul 24 '18 20:07 cody-code-wy

I'm skeptical about using something not backed by a major industry vendor. Halide is an academic project, they may get tired of developing it (when they've exhausted publishable stuff), they probably don't care about breaking changes (from the homepage: "These academic publications describe the ideas behind Halide and its scheduling model. Halide syntax changes over time, so don't rely on them for correct syntax."), etc.

Tools from major industry vendors (nvidia, intel) aren't open-source. So what?

If there's a viable alternative to CUDA, it's SYCL (Khronos standard; opencl using modern c++, i.e., something resembling CUDA), but the downside is that there are no free (as in beer) quality compilers that I'm aware of.

OpenCL seems to be the most future-oriented as it can support FPGAs as well. Intel has acquired Altera and another FPGA manufacturer, and OpenCL tooling will probably follow.

Jul 25 '18 07:07 zvrba

We are trying to compile Meshroom and AliceVision on Linux, but it's sad to discover that it will work only with a proprietary solution that I do not have (I use an AMD GPU with the Mesa driver).

Jul 27 '18 16:07 AndreaMonzini

To me the issue is: I do have the hardware, but it is just unstable as hell, requiring a lot of power cycles (since even the reset buttons stop working). Have been able to reproduce this with multiple kernel versions, driver versions, motherboards, GPUs, PSUs… It’s safe to say that it’s not a hardware issue, other than potential firmware bugs.

Anyway, my suggestion is to take a step back from all the frameworks and try to get just a basic C implementation done, with no drastic optimization whatsoever. My belief is that this will make future native ports (Vulkan, etc.) and SIMD optimization much easier, especially for outside contributors, because C is much more accessible. Also, before deciding on frameworks in an attempt to cover all potential use cases, it’s probably best to understand the challenges and requirements by doing a clean implementation with minimal external dependencies first.
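
To give an idea of what "basic C, no drastic optimization" could look like, here is a minimal sketch of a per-pixel matching cost written as plain nested loops, where a GPU kernel would instead run one thread per pixel. Everything in it is an assumption made for illustration: the function name `cost_map_sad`, the horizontal shift and the absolute-difference cost are hypothetical and are not taken from AliceVision's code.

```c
#include <stddef.h>

/* Hypothetical reference routine, not AliceVision code.
   For every pixel, compares left(x, y) with right(x - shift, y) and stores
   the absolute intensity difference. Images are 8-bit, single channel,
   stored row by row without padding. Requires width > 0 and height > 0. */
void cost_map_sad(const unsigned char *left, const unsigned char *right,
                  int width, int height, int shift, unsigned char *cost)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            /* Clamp the shifted column so reads stay inside the image. */
            int xr = x - shift;
            if (xr < 0) xr = 0;
            if (xr >= width) xr = width - 1;

            size_t row = (size_t)y * (size_t)width;
            int diff = (int)left[row + (size_t)x] - (int)right[row + (size_t)xr];
            cost[row + (size_t)x] = (unsigned char)(diff < 0 ? -diff : diff);
        }
    }
}
```

A loop like this is slow, but it is easy to check against a GPU result pixel by pixel, and it is the natural starting point for a later SIMD or Vulkan version.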

Jul 27 '18 17:07 mia-0

Hi @fabiencastan, is there a way to support a solution like HIP or Halide? Maybe an open-source bounty?

I think that supporting only one GPU vendor, with a proprietary GPGPU solution, is limiting for a very promising free and open source project.

I could find and buy a proprietary software alternative for photogrammetry, but I prefer to support free and open source software, and I use an AMD GPU for its free and open source drivers.

https://github.com/ROCm-Developer-Tools/HIP

Anyway thank you for sharing your work :)

Jul 30 '18 12:07 AndreaMonzini

example of HIP porting:

https://gpuopen.com/ported-caffe-hip-heres-happened/

Aug 10 '18 14:08 AndreaMonzini

Hi everyone,

I read through the comments, and it seems like the ditching of OpenCL/GL in the new OSX versions gives the developers a bit of a headache over which compute language to use for this program. I am a Mac user, and from following the "development" of new Macs (with Metal 1 and 2), it seems to me like they are ditching every other computing environment. Given that the last Nvidia GPUs used in any model date from around 2013, and with the already existing and growing empire of Metal, this probably won't change soon. Just want to give my view on the OSX "issue" ^^

Have a good one

Aug 28 '18 10:08 Ashtreighlia

Hello, from what I understand, HIP uses C++, so it should work without OpenCL.

https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_faq.md#how-does-hip-compare-with-opencl

Aug 28 '18 11:08 AndreaMonzini

Here is output after running hip (rocm) converter:

adi@adi-ryzen7:~/kompilacje/AliceVision$ /opt/rocm/hip/bin/hipconvertinplace-perl.sh src
...
info: TOTAL-converted 713 CUDA->HIP refs( dev:153 mem:74 kern:150 coord_func:0 math_func:0 special_func:3 stream:0 event:0 err:7 def:3 tex:323 extern_shared:0 other:0 ) warn:39 LOC:665119
  warning: unconverted cudaReadModeNormalizedFloat : 9
  warning: unconverted cudaArraySurfaceLoadStore : 6
  warning: unconverted cudaExtent : 5
  warning: unconverted cudaMemcpy3DParms : 4
  warning: unconverted cudaMemcpy3D : 4
  warning: unconverted cudaMalloc3DArray : 3
  warning: unconverted cudaMalloc3D : 2
  warning: unconverted cudaMemcpy2DFromArray : 2
  warning: unconverted cudaMemcpyFromArray : 2
  warning: unconverted cudaPitchedPtr : 2
  kernels (2 total) :   nearestKernel(1)  pushPull_Pull_kernel(1)
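
Several of the unconverted symbols (cudaExtent, cudaPitchedPtr, cudaMalloc3D, cudaMemcpy3D) deal with pitched 3D buffers, where each row is padded and an element is addressed through the row pitch. As a rough sketch of what a hand-written replacement would have to provide, the following plain C stand-in uses the same slice/row/column addressing. The names `volume_f32`, `volume_alloc`, `volume_at` and `volume_free` are hypothetical, and the 64-byte row padding is an assumption, not a value taken from CUDA, HIP or AliceVision.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed row padding granularity in bytes (illustrative value). */
#define ROW_ALIGNMENT 64u

/* Hypothetical CPU stand-in for a pitched 3D float buffer. */
typedef struct {
    unsigned char *data; /* base address, NULL if allocation failed */
    size_t pitch;        /* bytes per row, including padding */
    size_t width;        /* elements per row */
    size_t height;       /* rows per slice */
    size_t depth;        /* number of slices */
} volume_f32;

/* Allocates a zero-initialized volume whose rows are padded to a multiple
   of ROW_ALIGNMENT bytes. Size overflow is not checked in this sketch. */
volume_f32 volume_alloc(size_t width, size_t height, size_t depth)
{
    volume_f32 v = { NULL, 0, width, height, depth };
    size_t row_bytes = width * sizeof(float);
    v.pitch = (row_bytes + ROW_ALIGNMENT - 1) / ROW_ALIGNMENT * ROW_ALIGNMENT;
    v.data = calloc(v.pitch * height * depth, 1);
    return v;
}

/* Returns the address of element (x, y, z): skip z slices, then y rows,
   then x elements within the row. No bounds checking. */
float *volume_at(const volume_f32 *v, size_t x, size_t y, size_t z)
{
    unsigned char *slice = v->data + z * (v->pitch * v->height);
    unsigned char *row = slice + y * v->pitch;
    return (float *)row + x;
}

/* Releases the buffer and marks the volume as empty. */
void volume_free(volume_f32 *v)
{
    free(v->data);
    v->data = NULL;
}
```

The point of the sketch is only that these warnings concern memory layout rather than kernel logic, so they need manual porting but not new algorithms.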

Aug 30 '18 18:08 kwahoo2

@Storagraph The deprecation of OpenGL is not too much of a problem, since there are multiple translation libraries which can convert to multiple graphics backends. Khronos have succeeded in getting Vulkan to run everywhere regardless of graphics API thanks to the portability initiative. MoltenVK enables vendors to target macOS as well when using Vulkan's compute shaders. So Vulkan is the most portable option out there.

If anyone is intimidated by the Vulkan API, there is a project which reduces its complexity: V-EZ

So the CUDA dependency could be removed if Vulkan compute shaders were used.

Aug 30 '18 18:08 MrMinimal

@MrMinimal I just mentioned OpenGL for completeness. Vulkan/OpenGL/DirectX/D3D (graphics APIs) are used for rasterization of 3D objects and are generally not used for computing tasks; OpenCL (Open Computing Language) is for computing. There is a workaround that uses SPIR-V to access OpenCL via the front end in Vulkan, but doesn't this also need OpenCL support on OSX in the first place? Just to mention it, Apple announced in a press release that it will ditch both OpenCL and GL.

Sorry for the confusion ^^

Aug 30 '18 19:08 Ashtreighlia

@kwahoo2 thank you for the conversion with HIP. I think it could be the right solution with additional work.

Aug 31 '18 10:08 AndreaMonzini

Currently HIP doesn't support Windows and doesn't support the amdgpu-pro driver under Linux (in fact, only the ROCm platform under Linux is supported).

Aug 31 '18 11:08 PolarNick239

As a supporter of free and open source software under Linux, I prefer the AMDGPU Mesa FOSS driver. I would also like to support AliceVision because it's a FOSS project, and an open solution such as OpenCL, Vulkan, HIP or alternatives would be the best fit from the FOSS perspective.

Sep 04 '18 13:09 AndreaMonzini

Hello, just to inform about a new interesting project based on Vulkan that could be useful:

https://github.com/jgbit/vuda

Oct 08 '18 12:10 AndreaMonzini

any chance to run AliceVision/Meshroom - CPU only - without any specialized hardware, without nVidia, ... ? most of the discussion i see here is about nVidia, CUDA, AMD, Vulkan, macOS, Metal, ... (voodoo :P)

i have only an older intel CPU (i7-3xxx) with a "built-in" intel GPU (HD-4000) - i don't need more GPU power than the GPU on CPU. to me, time doesn't matter...

Oct 17 '18 02:10 beta-tester

@MrMinimal I just mentioned OpenGL for completeness. Vulkan/OpenGL/DirectX/D3D (graphic apis) are used for rasterization of 3D Objects and are generally not used for computing tasks, OpenCL (open computing language) is for computing. There is a work around by using SPIR-V to access OpenCL via the front-end in Vulkan, but doesn't this also need the support for OpenCL on OSX in the first place? Just to mention it, Apple announced in a press release, that will ditch both OpenCL & GL.

Sorry for the confusion ^^

Looking at the diagram "Bringing OpenCL Compute to Vulkan", it looks like there will be no need for an OpenCL environment in the future, just a compiler to Vulkan code.

Nov 12 '18 14:11 EmteZogaf

One library which uses Vulkan compute shaders is libplacebo, which will be used by upcoming VLC releases. It should be obvious, but performance can vary wildly depending on implementation. Here’s an interesting tidbit: https://github.com/haasn/libplacebo/blob/master/demos/video-filtering.c

Nov 12 '18 16:11 mia-0

Hi there :)

I see a lot of options here shown by others and I honestly trust this project. ^.^

I am starting an open source video game that makes heavy use of photogrammetry.

For that, the community's help in capturing and importing pictures is important. Easy accessibility in every possible way is obviously important.

I am curious why you implemented it as the only solution in the first place. Did you sit there and think "NVIDIA only will work great on an open source project"?

I heavily doubt that. I guess you had in mind back then that you could still add other solutions. You have probably forgotten about it as the years passed by, and now you are being reminded of it again.

My reason to support open source is the same one as for 90% of the others.

Choosing software that is significantly behind in features and performance compared to software that costs 30€ per month is only logical when it provides another huge benefit, like being completely open source.

For me, the only reason to support this project is a commitment that this is going to happen.

GPU acceleration is fine, CUDA is fine, NVIDIA only is questionable to me. I hope you see this issue :)

Dec 09 '18 09:12 ShalokShalom

I think the only reason AliceVision ended up with CUDA is that CUDA, despite its lackluster documentation and stability issues, is extremely popular in academia.

Dec 09 '18 17:12 mia-0

So Vision is only for academics? And academics is already sold to NVIDIA? And this project supports this direction?

Dec 10 '18 09:12 ShalokShalom

@lachs0r CUDA having "lackluster documentation"?! It has more documentation than I've ever seen for OpenCL, it integrates nicely with C++ language, with Visual Studio debugger, it comes with decent performance/profiling tools... PLEASE, point me to an OpenCL implementation with as advanced tooling as CUDA. Probably Intel's implementation is a candidate, but that one is not as easy to get free (of charge) as CUDA. And then you're still stuck with low-level APIs that don't integrate nicely with C++ (I know of no free (of charge) quality SYCL implementation).

So if anything, it boils down to developer-friendliness.

Dec 10 '18 17:12 zvrba

We have CPU versions for the two feature extractors, only DepthMap is CUDA-only. The stage can be bypassed, but it is important for quality. If anyone has the time and skill to port DepthMap to CPU or OpenCL, it would be a welcome candidate for inclusion in a future release.

We have discussed a HIP conversion (to add AMD cards) in the past, but the oldest parts of the CUDA code use texture references, while HIP knows nothing about textures.

A pure CPU port would be easier, and useful in the long run.
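
For a CPU port, the texture fetches would need a software replacement. As a minimal sketch, assuming 8-bit single-channel textures read as normalized floats (as with cudaReadModeNormalizedFloat), bilinear filtering, unnormalized coordinates and clamp addressing, a lookup could look like the code below. The filter and addressing modes are assumptions, since they depend on how each texture is configured in the CUDA code, and the function names are hypothetical.

```c
#include <stddef.h>

/* Limits v to the range [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Largest integer not greater than v, for values that fit in an int. */
static int floor_to_int(float v)
{
    int i = (int)v;
    return ((float)i > v) ? i - 1 : i;
}

/* Hypothetical software texture fetch, not AliceVision code.
   Samples an 8-bit image at (x, y) with bilinear filtering and returns a
   value in [0, 1]. Texel centers sit at half-integer coordinates, so
   (0.5, 0.5) returns exactly the first texel. Coordinates outside the
   image are clamped to the border. Requires width > 0 and height > 0. */
float sample_bilinear_u8(const unsigned char *img, int width, int height,
                         float x, float y)
{
    const float xb = x - 0.5f;
    const float yb = y - 0.5f;
    const int x0 = floor_to_int(xb);
    const int y0 = floor_to_int(yb);
    const float ax = xb - (float)x0; /* horizontal blend weight */
    const float ay = yb - (float)y0; /* vertical blend weight */

    const size_t xl = (size_t)clamp_int(x0, 0, width - 1);
    const size_t xr = (size_t)clamp_int(x0 + 1, 0, width - 1);
    const size_t yt = (size_t)clamp_int(y0, 0, height - 1) * (size_t)width;
    const size_t yd = (size_t)clamp_int(y0 + 1, 0, height - 1) * (size_t)width;

    /* Integer texels are mapped to [0, 1] before filtering. */
    const float t00 = img[yt + xl] / 255.0f;
    const float t10 = img[yt + xr] / 255.0f;
    const float t01 = img[yd + xl] / 255.0f;
    const float t11 = img[yd + xr] / 255.0f;

    const float top = t00 + ax * (t10 - t00);
    const float bottom = t01 + ax * (t11 - t01);
    return top + ay * (bottom - top);
}
```

GPU texture units typically compute the blend weights at reduced precision, so results from a float implementation like this one may differ slightly from the CUDA output.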

We who worked on the original release have no time to do it, although we all see the benefit of being more open. I’d be happy to discuss with anybody who would like to try.

Dec 12 '18 09:12 griwodz
