Ben Barsdell

37 comments by Ben Barsdell

I'll see if I can take another look at this later this week.

I believe the root cause of this is the `#include <climits>` header being loaded from jitify's builtins and cached, and then, when `#include "climits"` is encountered within libcu++, jitify uses...

No, currently they are not shared: each kernel instantiation has its own cuModule, so the addresses will be different (I confirmed with a test). This is arguably a design flaw...

I think linking will have the same issue because there will still be multiple modules, unless I'm misunderstanding.

> Would it not be possible to simply change the internals so...

Thanks for the PR! I filed an internal bug about the `__host__` `__device__` warnings; it seems to be a compiler issue. I believe it only affects debug builds, but I...

Thanks for this feedback, it's an important issue. The situation is a bit tricky. At its heart is the fact that having NVRTC load headers implicitly from the filesystem (which...

> We currently generate 1 dynamic header, which is not on the file-system, so this would create a problem for us.

This will still be possible by providing the header...

In the jitify2 API (under development) you can do this: https://github.com/NVIDIA/jitify/blob/ca7f794/jitify2.hpp#L2153

To ensure cuda_fp16.h can be found you'll need to pass the CUDA Toolkit include directory as a flag like this: `-I/path/to/cuda/include`. Here's a minimal example (it uses `half` which is...

That's right. One option would be to use an environment variable such as `CUDA_PATH`.
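As a rough illustration of the environment-variable idea, the sketch below builds the `-I.../include` flag from `CUDA_PATH`. The helper names `cuda_include_flag` and `cuda_include_flag_from_env` are hypothetical and not part of jitify, and the `/usr/local/cuda` fallback is only an assumption about a typical Linux install.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of jitify): turns a CUDA Toolkit root
// directory into an include flag such as "-I/opt/cuda/include".
// If cuda_root is null or empty, fallback_root is used instead; the default
// fallback is an assumption about a common Linux install location.
inline std::string cuda_include_flag(
    const char* cuda_root,
    const std::string& fallback_root = "/usr/local/cuda") {
  std::string root = (cuda_root && *cuda_root) ? cuda_root : fallback_root;
  // Strip trailing path separators so the result has no doubled slash.
  while (root.size() > 1 && (root.back() == '/' || root.back() == '\\')) {
    root.pop_back();
  }
  return "-I" + root + "/include";
}

// Hypothetical convenience wrapper: reads the root from the CUDA_PATH
// environment variable (std::getenv returns nullptr when it is unset).
inline std::string cuda_include_flag_from_env() {
  return cuda_include_flag(std::getenv("CUDA_PATH"));
}
```

The resulting string could then be passed along with the other compiler flags wherever `-I/path/to/cuda/include` would otherwise be hard-coded.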