[BUG]: Use of raft::span initialize in per_v_transform_e.cuh results in failure of is_allowed_element_type_converstion_t
Version
25.02
Which installation method(s) does this occur on?
No response
Describe the bug.
In compiling 25.02, specifically the file src/prims/detail/per_v_transform_e.cuh, one runs in an error in the following code block:
std::optional<std::variant<raft::device_span<uint32_t const>, raft::device_span<size_t const>>>
hypersparse_non_deg1_key_offsets{std::nullopt};
if constexpr (filter_input_key) {
if (edge_partition_hypersparse_key_offset_vectors) {
auto const& offsets = (*edge_partition_hypersparse_key_offset_vectors)[j];
if (offsets.index() == 0) {
hypersparse_non_deg1_key_offsets = raft::device_span<uint32_t const>(
std::get<0>(offsets).data(),
std::get<0>(offsets).size() -
(edge_partition_deg1_hypersparse_key_offset_counts
? (*edge_partition_deg1_hypersparse_key_offset_counts)[j]
: size_t{0}));
} else {
hypersparse_non_deg1_key_offsets = raft::device_span<size_t const>(
std::get<1>(offsets).data(),
std::get<1>(offsets).size() -
(edge_partition_deg1_hypersparse_key_offset_counts
? (*edge_partition_deg1_hypersparse_key_offset_counts)[j]
: size_t{0}));
}
The particular error occurs in the if/else block where hypersparse_non_deg1_key_offsets is intialized using raft::device_span (both instances, on in the if block, the other in the else block. Tracking this down one can see that the failure occurs in raft's span.hpp where you have
95 /**
96 * @brief Initialize a span class from another one who's underlying type is convertible
97 * to element_type.
98 */
99 template <class U,
100 std::size_t OtherExtent>
101 class = typename std::enable_if<
102 detail::is_allowed_element_type_conversion_t<U, T>::value &&
103 detail::is_allowed_extent_conversion_t<OtherExtent, Extent>::value>>
104 constexpr span(const span<U, is_device, OtherExtent>& other) noexcept
105 : span{other.data(), other.size()}
106 {
110 }
The error occurs because is_allowed_element_type_conversion_t returns false, causing the enable_if to fail and thus not implement the constructor. The compiler then is unable to find a suitable constructor amongst the available options. It appears this is just a 'failure' of span to be willing to convert one integer pointer to another another integer pointer of a different size.
Minimum reproducible example
One can see a simplified version of this error in the following code
template <class From, class To>
struct is_allowed_element_type_conversion_t
: public std::integral_constant<bool, std::is_convertible<From (*)[], To (*)[]>::value> {};
int main() {
static_assert(is_allowed_element_type_conversion_t<uint32_t , size_t >::value);
return 0;
}
Relevant log output
Environment details
Other/Misc.
No response
Code of Conduct
- [x] I agree to follow cuGraph's Code of Conduct
- [x] I have searched the open bugs and have found no duplicates for this bug report
You mean this code tries to assign a raft::device_span<uint32_t const> to a raft::device_span<size_t const> object? hypersparse_non_deg1_key_offsets is a std::optional object of a std::variant. The std::variant object should be able to store either raft::device_span<uint32_t const> or raft::device_span<size_t const> without type conversion.
Or did I misinterpret your question?
Just below that. So the two instances where you try to initialize from hypersparse_non_deg1_key_offsets from a call to device_span<>, e.g.
hypersparse_non_deg1_key_offsets = raft::device_span<uint32_t const>(
std::get<0>(offsets).data(),
std::get<0>(offsets).size() -
(edge_partition_deg1_hypersparse_key_offset_counts
? (*edge_partition_deg1_hypersparse_key_offset_counts)[j]
: size_t{0}));
The other uses of the std::optional<std::variant<>> stuff in the rest of this compilation unit don't try to use this particular way to initialize the the lhs so they don't run in to this problem but this one does and while there's nothing wrong per se with what you're doing here it runs afoul of the the is_allowed_element_type_conversion test in raft which causes the enable_if to fail and to keep this particular initialization method to throw an error because the constructor fails to exist. The problem is mostly on the raft side because the is_allowed_element_type_conversion test doesn't look to see if I can convert an int to a long say, it looks if I can convert an int * to a long * and that's where the fail occurs and what the simple test was meant to show. I hope that clears things up.
I am not following this.
Which type & version of compiler are you using? This code path is actually used and I haven't seen a build error.
Initializing a raft::devcie_span<long const> object from a raft::device_span<int const> object is not always valid; so I think the enable_if failure is valid (how should we initialize 8B array from 4B array of size 10, 8B array of size 5?). I am not sure why this conversion is attempted.
If we are assigning raft::device_span<uint32_t const> to the std::variant object, the object should be initialized to raft::device_span<uint32_t const> (no need to convert this to raft::device_span<size_t const>).
Let me know if I am misinterpreting your question or I am missing something else. And it will be really helpful if I can reproduce your build error.
I'm just pointing out something in your code that is not standards compliant and passes when it shouldn't. Clang catches this. Maybe nvcc interprets this differently but I don't understand why yet. What I do know is that it fails the is_allowed_element_type_convert_t test and nvcc for some reason passes it (yet doesn't pass it in the simplified example I sent which, honestly, after years of doing this isn't surprising...i.e. a complex example failing and a simplified example passing....this is just the reverse).
So you have a situation where
A = B
B is used to initialize A. To do this we need to call an operator= member function in span.hpp which acts as our constructor. This leads us here:
/**
* @brief Initialize a span class from another one who's underlying type is convertible
* to element_type.
*/
template <class U,
std::size_t OtherExtent,
class = typename std::enable_if<
**detail::is_allowed_element_type_conversion_t<U, T>::value** &&
detail::is_allowed_extent_conversion_t<OtherExtent, Extent>::value>>
constexpr span(const span<U, is_device, OtherExtent>& other) noexcept
: span{other.data(), other.size()}
{
}
So the bit in bold is the bit that should fail based on what it's trying to do. In span, that test looks like
template <class From, class To>
struct is_allowed_element_type_conversion_t
: public std::integral_constant<bool, std::is_convertible<From (*)[], To (*)[]>::value> {};
If the test were different, i.e.
template <class From, class To>
struct is_allowed_element_type_conversion_t
: public std::integral_constant<bool, std::is_convertible<From , To>::value> {};
This would pass if From was uint32_t and To was size_t
However,the test is trying to determine if 'uint32_t (*)[]' (a pointer to an array) is convertible to size_t {*)[]. You'd have to have some kind of 'reinterpret_cast' in there for that to happen. My simple example shows even nvcc reports that it can't do that so it's a bit of a mystery to me why it passes the compilation phase at all.
So you have a situation where
A = B
B is used to initialize A. To do this we need to call an operator= member function in span.hpp which acts as our constructor. This leads us here:
Your explanation makes perfect sense if the type of A is raft::device_span<size_t const> and the type of B is raft::device_span<uint32_t const>. But here, the type of A is std::variant<raft::device_span<uint32_t const>, raft::device_span<size_t const>>`.
std::variant<T, U> var{};
if (...) {
var = T(...);
}
else {
var = U(...);
}
What I am struggling to understand is why the constructor to create T from U (or U from T) is invoked, here. If this is attempted, it should fail, but this code shouldn't require this.
I am digging into the documentation of std::optional and std::variant to figure out the exact expected behavior by the standard, but at first glance, I feel like this type conversion should not be necessary.
Going to close this, it does not seem like the type conversion should be necessary. If you have a more specific example that you can share that fails, please create a new issue with that example so we can explore further.