libcudacxx icon indicating copy to clipboard operation
libcudacxx copied to clipboard

CUDA Tuple Structured Binding Declaration Broken

Open jdwapman opened this issue 1 year ago • 4 comments

I'm running into issues where cuda::std::tuple does not seem to support structured binding declarations. Is this a feature that should work but is broken? Is this unsupported for cuda::std::tuple specifically?

Note that the normal std::tuple supports this cpp17 feature, even in device code. If I comment out the cuda_tuple_kernel and its calls in the main() function, the code compiles without issues.

Code:

#include <cuda/std/tuple>
#include <tuple>

#define CHECK_CUDA(cmd)                                  \
  do {                                                   \
    cudaError_t res = (cmd);                             \
    if (res != cudaSuccess) {                            \
      fprintf(stderr, "CUDA: %s = %d (%s)\n", #cmd, res, \
              cudaGetErrorString(res));                  \
      abort();                                           \
    }                                                    \
  } while (0)

// This works
__global__ void std_tuple_kernel()
{
  std::tuple<bool, float> my_tup = std::make_tuple(true, 1.0f);

  printf("Direct access: %d, %f\n", std::get<0>(my_tup), std::get<1>(my_tup));

  auto [first, second] = my_tup;

  printf("Structured binding: %d, %f\n", first, second);
}

// This fails to compile
__global__ void cuda_tuple_kernel()
{

  cuda::std::tuple<bool, float> my_tup = cuda::std::make_tuple(true, 1.0f);

  printf("Direct access: %d, %f\n", cuda::std::get<0>(my_tup), cuda::std::get<1>(my_tup));

  auto [first, second] = my_tup;

  printf("Structured binding: %d, %f\n", first, second);
}

int main()
{

  std_tuple_kernel<<<1, 1>>>();
  CHECK_CUDA(cudaDeviceSynchronize());

  cuda_tuple_kernel<<<1, 1>>>();
  CHECK_CUDA(cudaDeviceSynchronize());
}

Compile command:

nvcc tuple_issues.cu --expt-relaxed-constexpr -std=c++17 -gencode=arch=compute_80,code=compute_80

Compile error:

tuple_issues.cu(22): error: cannot bind to non-public member "cuda::std::__4::tuple<_Tp...>::__base_ [with _Tp=<__nv_bool, float>]"

System Info: GPU: A100 nvcc: 11.7.64 g++: 9.4.0 OS: Ubuntu 20 LTS

jdwapman avatar Sep 20 '22 22:09 jdwapman

Hm, confirmed: https://godbolt.org/z/YPdYdaGTW

@wmaxey is this a known problem? It looks like our structured binding tests for tuple are commented out: https://github.com/NVIDIA/libcudacxx/blob/4b1a1df666f3bac26f60cb06cda051ca6fc9b158/.upstream-tests/test/std/utilities/tuple/tuple.tuple/tuple.helper/tuple_size_structured_bindings.pass.cpp#L49-L67

The comment indicates a possible compiler bug as well.

@miscco you've been looking at tuple stuff lately, so you may be able to help look into this as well.

jrhemstad avatar Sep 20 '22 23:09 jrhemstad

Interesting, I've been running into general std::tuple compile errors in nvcc as well. For example:

template <typename T>
class TestTupleMember
{

public:
    // remove constexpr to fail compile on nvcc >= 11.7)
    constexpr __host__ __device__ TestTupleMember(T _data)
    {
        data = _data;
    }

    T data;
};

__global__ void make_tuple_kernel()
{
    auto my_tup = std::make_tuple(TestTupleMember(1), TestTupleMember(2));

    auto [first, second] = my_tup;

    printf("Structured binding: %d, %d\n", first.data, second.data);
}

The above code fails to compile unless I add a constexpr in front of the __host__ statement, but only for 11.7 and up. 11.6.2 compiles successfully. I know this might be out of scope for libcudacxx, but maybe it helps track down the compiler issue.

jdwapman avatar Sep 20 '22 23:09 jdwapman

Note that this succeeds if I use std::make_tuple with a basic type. It's only a class or struct that fails to compile.

jdwapman avatar Sep 20 '22 23:09 jdwapman

Hm, confirmed: https://godbolt.org/z/YPdYdaGTW

@wmaxey is this a known problem? It looks like our structured binding tests for tuple are commented out:

https://github.com/NVIDIA/libcudacxx/blob/4b1a1df666f3bac26f60cb06cda051ca6fc9b158/.upstream-tests/test/std/utilities/tuple/tuple.tuple/tuple.helper/tuple_size_structured_bindings.pass.cpp#L49-L67

The comment indicates a possible compiler bug as well.

@miscco you've been looking at tuple stuff lately, so you may be able to help look into this as well.

So this is a "I should consider gardening" moment.

The issue at hand is that structured bindings only work when the respective tuple machinery is declared in namespace std.

You can have a look at this here https://godbolt.org/z/h438MMfhM

miscco avatar Sep 21 '22 06:09 miscco