manifold icon indicating copy to clipboard operation
manifold copied to clipboard

Remove thrust option

Open fire opened this issue 2 years ago • 31 comments

To make manifold easier to build there was a proposal to remove thrust for C++ primitives.

This is tracking that thrust removal proposal.

fire avatar Jun 16 '22 16:06 fire

In fact what we are doing with https://github.com/elalish/manifold/blob/master/utilities/include/par.h is building a proxy for some of the thrust APIs, although there are still some stuff like thrust transform iterators elsewhere in the code. I think we can make it a compile time option to use stl implementation instead of thrust and use std::execution::par if the compiler support it, so users who don't want to build the CUDA variant don't have to pull in the thrust dependencies.

~I think it will take a long time for stl to support GPU offloading for parallel algorithms, and I doubt if the performance will be portable. In our code, we often switch between CPU and GPU processing, and we use CUDA unified memory to try to minimize data transfer. For a general implementation, e.g. based on OpenCL, iirc there is no unified memory available.~ It turns out OpenCL has Shared Virtual Memory (SVM). But OpenCL Kernel launch seems to be slower due to the need of runtime compilation (OpenCL kernels are C strings that are submitted to the device driver to compile...).

pca006132 avatar Jun 16 '22 16:06 pca006132

I don't think the game industry uses opencl. There seems to be move towards technologies like onnx and vulkan compute.

fire avatar Jun 16 '22 18:06 fire

For vulkan I guess the only way we can do GPU accelerated computation is through compute shader? Wondering if we can do complicated things like CUDA and what is the overhead of launching it...

Anyway I think we can just provide one with sequential (multithreaded implementation if the compiler is new) if you need to disable thrust.

pca006132 avatar Jun 17 '22 13:06 pca006132

There are other gpus than Nvidia so calling CUDA isn't able to support Godot Engine's platform matrix for example on the Apple Macs.

fire avatar Jun 17 '22 14:06 fire

I believe someone here already got Manifold compiling on Apple Macs with the Thrust C++ backend. The nice thing about what @pca006132 did switching everything over to .cpp files and pulling in thrust as a proper third_party dep is that it works on a bunch of different compilers now, no longer needing nvcc or CUDA, since Thrust is just a C++ header library. The only thing you need an Nvidia GPU for is if you want the CUDA backend.

elalish avatar Jun 17 '22 15:06 elalish

The other issues worth mentioning:

  • [x] I made a script with degit to copy the dependencies without git submodules https://github.com/V-Sekai/godot/blob/csg-manifold-07/modules/csg/checkout.sh
  • [ ] Need to remove exceptions from thrust since they cause warnings and code with warnings are not mergable.
  • [X] Possible to compile on the mac, will need to check again.

fire avatar Jun 17 '22 16:06 fire

For removing exceptions, can you open a separate issue to track that? I think I can work on that as I have experience using rust, which does not use exceptions. I think the good way to do this might be to abort on some invariant check (assertion fault is considered a bug on the library but not user input error), and data validation failure will return some error code.

Removing exceptions from thrust would be a bit tricky though, I guess the only way to do it is to write a proxy module that provides those iterator helpers that we use, and use a compile-time option to decide whether or not to include thrust.

pca006132 avatar Jun 17 '22 16:06 pca006132

Yeah, a separate issue sounds good regarding exceptions we already had a bit of sub-thread about this, ending here, but I have to admit I'm still a bit confused as to which exceptions are okay vs not (STL containers, Thrust, our own, etc). I still think exception code is easier to read than return codes, so I'm not super excited to switch to that. Obviously exceptions shouldn't get hit much, which I think we're doing a decent job of - they're basically just indicating bad input (which you can check first on your side anyway) or halting before something segfaults.

elalish avatar Jun 17 '22 17:06 elalish

I'll post a new issue for exceptions. Have to get some things done first. Today or a few days.

fire avatar Jun 17 '22 17:06 fire

I may have some trouble getting Godot Engine maintainers to accept this amount of changes.

image

The problematic parts are google test, glm and thrust.

fire avatar Jul 13 '22 18:07 fire

I understand your concerns here, but I think making every dependency optional will take a lot of effort, requires copying a bunch of code from these dependencies, and complicate the build process even more (thrust is optional but we have to support both cases).

pca006132 avatar Jul 14 '22 00:07 pca006132

I did a grep to see what do we need to mock glm and thrust. These are the functions that we use:

glm::abs
glm::acos
glm::all
glm::asin
glm::atan
glm::cos
glm::cross
glm::degrees
glm::dot
glm::greaterThanEqual
glm::half_pi
glm::inverse
glm::isfinite
glm::ivec3
glm::length
glm::mat2
glm::mat2x3
glm::mat2x4
glm::mat3
glm::mat3x2
glm::mat3x4
glm::mat4
glm::mat4x3
glm::max
glm::min
glm::mix
glm::normalize
glm::pi
glm::radians
glm::rotate
glm::sqrt
glm::transpose
glm::tvec2
glm::tvec3
glm::tvec4
glm::two_pi
glm::uint64_t
glm::vec2
glm::vec3
glm::vec4
thrust::binary_function
thrust::counting_iterator
thrust::get
thrust::identity
thrust::iterator_difference
thrust::lower_bound
thrust::make_counting_iterator
thrust::make_pair
thrust::make_permutation_iterator
thrust::make_transform_iterator
thrust::make_tuple
thrust::make_zip_iterator
thrust::maximum
thrust::minimum
thrust::negate
thrust::pair
thrust::permutation_iterator
thrust::placeholders
thrust::swap
thrust::transform_iterator
thrust::tuple
thrust::unary_function
thrust::zip_iterator

in addition to the functions in par.h. Some of these have standard library implementations but some do not.

pca006132 avatar Jul 14 '22 01:07 pca006132

Yeah, thrust and GLM are pretty fundamental dependencies. They're header only libs that are well tested. If that's not acceptable, I don't think there's much we can do for you. You should have no need for Google test, presumably you have your own test system in Godot. Not sure why you'd want to bring ours.

elalish avatar Jul 14 '22 04:07 elalish

I don't think you want to include the meshIO, test, or tools directories either. I wonder if we should organize our directories differently to make this more clear or simpler?

elalish avatar Jul 14 '22 04:07 elalish

Reorganizing if it's easy to do can help.

I think glm is fine, and there was some mentioned of being able to remove thrust previously, so I mentioned it.

fire avatar Jul 14 '22 06:07 fire

I think I can work on making thrust optional, but the problem is that there are quite a few iterators that we depends on, e.g. zip_iterator, that requires either thrust or boost iterator. Not sure if there is an iterator library that is smaller but provides the same functionalities.

pca006132 avatar Jul 15 '22 02:07 pca006132

Reorganizing is easy for us; the question is what's easy for users? Does anyone have any suggestions?

elalish avatar Jul 15 '22 05:07 elalish

I can try looking for a replacement for zip_iterator.

https://github.com/dpellegr/ZipIterator

I think reducing the number of dependencies while keeping performance will provide the widest scope of usability.

fire avatar Jul 15 '22 05:07 fire

https://github.com/NVIDIA/MatX Can someone evaluate this?

fire avatar Aug 29 '22 17:08 fire

https://github.com/NVIDIA/MatX Can someone evaluate this?

That looks if anything like an even bigger dependency than Thrust (it's higher-level and not really what we need). @fire how big of a blocker is the Thrust dependency for you? Most of it has already been pulled into the STL in the form of C++17 parallel algorithms, which is what we would move to, so that's not exactly different, dependency-wise.

elalish avatar Aug 31 '22 02:08 elalish

Rather than replacing one dependency with another, how about decoupling it with traits classes?

pentacular avatar Aug 31 '22 04:08 pentacular

Rather than replacing one dependency with another, how about decoupling it with traits classes?

I have been thinking about this and tried to use std functions and structures instead of those of thrust's, but CUDA doesn't like it so we reverted it. I think we may need to write a simple header only library for some of the constructs like transform iterator, pairs, and make sure that CUDA is happy with that.

pca006132 avatar Aug 31 '22 08:08 pca006132

I'd like to get the boolean core as a header only library as well, for similar reasons -- but I've been a bit busy so far.

pentacular avatar Aug 31 '22 09:08 pentacular

I looked at the thrust dependency and it's a blocker in two ways, we can't enable the cuda dependency and thrust has a lot of code.

fire avatar Aug 31 '22 14:08 fire

@pentacular header-only should be doable; I suppose you could even make the template parameter choose double or single precision. Double would be nice for large-scale applications. I mostly chose single because it's better supported on GPUs and most applications don't need the extra precision. I didn't do header-only because I wanted to keep compile time down, but I'm not sure exactly how much difference that'll make.

elalish avatar Sep 01 '22 04:09 elalish

I looked at the thrust dependency and it's a blocker in two ways, we can't enable the cuda dependency and thrust has a lot of code.

@fire, CUDA isn't a dependency - in fact we use Thrust specifically so we can target non-GPU architectures and compilers, as shown in our CI (which has no GPU, nor CUDA toolkit). Thrust is a lot of code, but using the STL will be no less code - it's basically just the same header-only library, but included by the compiler instead of us. In fact Thrust is included by the nvcc compiler, which is why you don't usually see it listed as a dependency for other GPU libraries. Our WASM build is only 150kb, so in terms of binary size I'd say it's not so bad. Can you be more specific about what the bar is we need to cross in order to unblock you?

elalish avatar Sep 01 '22 04:09 elalish

@pentacular header-only should be doable; I suppose you could even make the template parameter choose double or single precision. Double would be nice for large-scale applications. I mostly chose single because it's better supported on GPUs and most applications don't need the extra precision. I didn't do header-only because I wanted to keep compile time down, but I'm not sure exactly how much difference that'll make.

I think the compile time for CUDA will be a bit high, but it should be very fast to build for CPU. Most of our build time in the CI was spent building assimp.

pca006132 avatar Sep 07 '22 02:09 pca006132

Thrust is a lot of code, but using the STL will be no less code

I know this is a weak argument, but it was difficult to "read" the library of thrust because of it's large scope.

Some reasons with the slow updates was that doing mesh operations with meshes and having them crash with merging operations was frustrating. Many of the fixes were commited to manifold master, but I wasn't able to evaluate before the Godot 4.0 merge window ended.

fire avatar Sep 07 '22 07:09 fire

@fire Totally understandable! I really appreciate your feedback in making our library more easily consumable and also your testing. I certainly hope it's working better for you now. When you get a chance, it would be great to know if you're still seeing any crashes, or if the only problem now is really just non-manifold input. I'd like to add manifold repair, but that will be very heuristic and I don't have a solid plan for how to do it yet.

elalish avatar Sep 07 '22 16:09 elalish

So my reading of the Godot Engine policies is we probably will be slotted under the "gdextension" policy and be a different shared library meaning we can use cuda and omp, but it also means it won't be in core. So the non-human understandable size of the Thrust problem, requiring openmp or requiring CUDA may be ignored.

fire avatar Sep 08 '22 18:09 fire