thrust
thrust copied to clipboard
partial fix for iterator traits and tags
The symptoms of this problem are described in NVIDIA/cccl#705. Really, there are two problems with the way Thrust's iterator tags are currently defined:
-
The
*_device_*
iterator tags are erroneously convertible the*_host_*
tags. That's because the*_host_*
tags are simply type aliases for the standard iterator tags, which the*_device_*
tags inherit from. -
Because of the way the device iterator tags are defined (with
iterator_category_with_system_and_traversal
) there is no relationship between the different device tags. We would expect (and much of the code does) thatthrust::forward_device_iterator_tag
is convertible tothrust::input_device_iterator_tag
, but that is sadly not true.
The two traits, iterator_category_to_system
and iterator_category_to_traversal
, are first checking whether the category is convertible to a host iterator tag. That is trivially true even for the device tags because of (1); as a result, device tags appear to have the "host" system, which is incorrect.
This PR offers the simplest, least intrusive, and sadly partial fix to the problem reported in NVIDIA/cccl#705. It changes the two traits so that they first check for convertibility to the device tag. This gets the answer correctly in more cases than currently.
I think I know how to fix this properly, but not without potentially breaking some code, so I'm submitting this PR now as a stop-gap.
run tests
@allisonvacanti I see you tagged this with "2.0" milestone, but this is a change that is safe to make in the 1.X series, IMO.
this is a change that is safe to make in the 1.X series, IMO.
It would be, but the RC is already out. We only restart the RC process to add fixes for regressions.
run tests
I suppose if an iterator had any_system_tag
as its system tag, then this diff would change its reported system from host to device. :thinking:
@allisonvacanti, do any iterators use any_system_tag
?
do any iterators use any_system_tag?
Yes -- this tag is used by iterators that aren't backed by actual memory (e.g thrust::counting_iterator
, thrust::constant_iterator
, etc, as well as several in CUB).
I suppose if an iterator had any_system_tag as its system tag, then this diff would change its reported system from host to device. 🤔
This should be fine. The device system is typically faster than the host system, so this is the preferred behavior. If an algorithm takes an any_system
iterator and also a host_system
iterator (e.g. thrust::copy(any_system_it, any_system_it_end, host_system_it_begin)
) and requires the algorithm to run on host, it should still work since we check the systems of all iterators while dispatching.
@ericniebler @allisonvacanti
Is this ready to merge?
Is this ready to merge?
I ... think so? I can rebase, retest and commit if all looks good.
run tests
Hm that one is strange:
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(70): error: qualified name is not allowed
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(72): error: expected a "("
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(72): error: expected a type specifier
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(72): error: expected a ")"
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(76): error: expected a "("
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(76): error: expected a type specifier
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(77): error: expected a ";"
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(78): error: expected a "("
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(78): error: expected a ";"
/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ext/numeric_traits.h(88): error: class template "__gnu_cxx::__numeric_traits_integer<_Value>" has no member "__is_signed"
There is nothing in this PR that could have triggered this
@miscco we have wait till https://github.com/NVIDIA/thrust/pull/1866 is merged.