circle
circle copied to clipboard
circle: "expected return statement in function int atomicAdd(int*, int)" when calling CUDA atomic functions
The program
#include <device_atomic_functions.h>
__global__ void kernel(int* ptr)
{
if(ptr)
{
atomicAdd(ptr, 1);
}
}
int main()
{
kernel<<<1,1>>>(0);
return 0;
}
Causes circle to yield the error
$ circle --cuda-path=/usr/local/cuda -sm_60 repro.cpp
ODR used by: int main()
repro.cpp:13:9
kernel<<<1,1>>>(0);
^
(sm_60) ODR used by: void kernel(int*)
repro.cpp:7:14
atomicAdd(ptr, 1);
^
error: /usr/local/cuda/include/device_atomic_functions.h:106:71
... included from repro.cpp:1:10
expected return statement in function int atomicAdd(int*, int)
__DEVICE_ATOMIC_FUNCTIONS_DECL__ int atomicAdd(int *address, int val) __DEF_IF_HOST
Compiler details:
$ ~/Desktop/circle/circle --version
circle version 1.0.0-155
Circle public preview build 155
Built Feb 8 2022 11:15:07
(c) 2021 Sean Baxter
https://www.circle-lang.org/
Twitter: @seanbax
Is this what you have on your system?
#ifndef __CUDA_ARCH__
#define __DEF_IF_HOST { }
#else /* !__CUDA_ARCH__ */
#define __DEF_IF_HOST ;
#endif /* __CUDA_ARCH__ */
Not sure what to do if NVIDIA won't fix their headers. An empty definition for atomicAdd is useless.
Yes, I see similar definitions on my system.
I was surprised and delighted that this equivalent program just works:
#include <atomic>
#define NDEBUG
#include <cassert>
#include <cuda_runtime_api.h>
__global__ void kernel(int* ptr)
{
if(ptr)
{
std::atomic_ref<int>(*ptr).fetch_add(1);
};
}
__managed__ int value;
int main()
{
value = 0;
kernel<<<1,1>>>(&value);
cudaDeviceSynchronize();
assert(value == 1);
return 0;
}
The definition of atomicAdd that you want is in device_atomic_functions.hpp. (Yes there are definitions in different files.) But including that breaks on undeclared identifiers __any and __all. I remember removing those, along with old __ballot, because they were obsolete. The intrinsics __nvvm_vote_all_sync etc were intended to replace them. I think the atomicAdd would work, but due to poor header hygiene it can't be included yet.
@allisonvacanti @brycelelbach Is there a robust way to get oldschool CUDA atomics included using a tagless frontend? Do __any and __all still need to be implemented as compiler builtins?
What do you mean by tagless frontend?
Like circle or nvc++. The current headers is attaching a bogus { } definition to host atomicAdd, and the correct definition to device atomicAdd, and using CUDA_ARCH to separate the declarations. Jared includes device_atomic_functions.h, and since CUDA_ARCH isn't defined, it pulls in the empty definition { } which breaks, since there is no return statement.
Yah unfortunately there's no resolution to this ATM.