circle icon indicating copy to clipboard operation
circle copied to clipboard

circle: "expected return statement in function int atomicAdd(int*, int)" when calling CUDA atomic functions

Open jaredhoberock opened this issue 3 years ago • 7 comments
trafficstars

The program

#include <device_atomic_functions.h>

__global__ void kernel(int* ptr)
{
  if(ptr)
  {
    atomicAdd(ptr, 1);
  }
}

int main()
{
  kernel<<<1,1>>>(0);

  return 0;
}

Causes circle to yield the error

$ circle --cuda-path=/usr/local/cuda -sm_60 repro.cpp 
ODR used by: int main()
repro.cpp:13:9
  kernel<<<1,1>>>(0); 
        ^

(sm_60) ODR used by: void kernel(int*)
repro.cpp:7:14
    atomicAdd(ptr, 1); 
             ^

  error: /usr/local/cuda/include/device_atomic_functions.h:106:71
  ... included from repro.cpp:1:10
  expected return statement in function int atomicAdd(int*, int)
  __DEVICE_ATOMIC_FUNCTIONS_DECL__ int atomicAdd(int *address, int val) __DEF_IF_HOST 

Compiler details:

$ ~/Desktop/circle/circle --version
circle version 1.0.0-155
  Circle public preview build 155
  Built Feb  8 2022 11:15:07
  (c) 2021 Sean Baxter
  https://www.circle-lang.org/
  Twitter: @seanbax

jaredhoberock avatar Feb 13 '22 03:02 jaredhoberock

Is this what you have on your system?

#ifndef __CUDA_ARCH__
#define __DEF_IF_HOST { }
#else  /* !__CUDA_ARCH__ */
#define __DEF_IF_HOST ;
#endif /* __CUDA_ARCH__ */

Not sure what to do if NVIDIA won't fix their headers. An empty definition for atomicAdd is useless.

seanbaxter avatar Feb 13 '22 19:02 seanbaxter

Yes, I see similar definitions on my system.

jaredhoberock avatar Feb 13 '22 20:02 jaredhoberock

I was surprised and delighted that this equivalent program just works:

#include <atomic>

#define NDEBUG
#include <cassert>
#include <cuda_runtime_api.h>


__global__ void kernel(int* ptr)
{
  if(ptr)
  {
    std::atomic_ref<int>(*ptr).fetch_add(1);
  };
}

__managed__ int value;

int main()
{
  value = 0;

  kernel<<<1,1>>>(&value);
  cudaDeviceSynchronize();

  assert(value == 1);

  return 0;
}

jaredhoberock avatar Feb 13 '22 20:02 jaredhoberock

The definition of atomicAdd that you want is in device_atomic_functions.hpp. (Yes there are definitions in different files.) But including that breaks on undeclared identifiers __any and __all. I remember removing those, along with old __ballot, because they were obsolete. The intrinsics __nvvm_vote_all_sync etc were intended to replace them. I think the atomicAdd would work, but due to poor header hygiene it can't be included yet.

@allisonvacanti @brycelelbach Is there a robust way to get oldschool CUDA atomics included using a tagless frontend? Do __any and __all still need to be implemented as compiler builtins?

seanbaxter avatar Feb 14 '22 22:02 seanbaxter

What do you mean by tagless frontend?

brycelelbach avatar Feb 18 '22 21:02 brycelelbach

Like circle or nvc++. The current headers is attaching a bogus { } definition to host atomicAdd, and the correct definition to device atomicAdd, and using CUDA_ARCH to separate the declarations. Jared includes device_atomic_functions.h, and since CUDA_ARCH isn't defined, it pulls in the empty definition { } which breaks, since there is no return statement.

seanbaxter avatar Feb 18 '22 21:02 seanbaxter

Yah unfortunately there's no resolution to this ATM.

brycelelbach avatar Feb 18 '22 21:02 brycelelbach