Set underlying type for enum class exec_tag to uint16_t
Set the underlying type of enum class exec_tag to ::cuda::std::uint16_t, rather than the default underlying type (int, which is 32 bits wide).
This change reduces the size of an exec_tag instance from 4 bytes to 2 bytes. It also makes the underlying type of exec_tag explicit.
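A minimal sketch of the size difference, for illustration only. It uses std::uint16_t in place of ::cuda::std::uint16_t so that it compiles without the CUDA headers, and the enum names and enumerators (host, device) are hypothetical, not the real exec_tag values.

```cpp
#include <cstdint>

// Hypothetical enumerators; the actual exec_tag values are not shown in this PR text.

// No underlying type given: a scoped enum defaults to int.
enum class exec_tag_default { host = 1, device = 2 };

// Underlying type fixed to a 16-bit unsigned integer.
enum class exec_tag_u16 : std::uint16_t { host = 1, device = 2 };

// The default-typed enum has the size of int (4 bytes on common platforms).
static_assert(sizeof(exec_tag_default) == sizeof(int),
              "scoped enums default to int");

// The explicitly typed enum occupies 2 bytes.
static_assert(sizeof(exec_tag_u16) == 2,
              "uint16_t-backed enum is 2 bytes");
```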
The CI failures are caused by network issues while fetching the CPM script and other dependencies.
I'm not a fan of hardcoding this to a specific type, and would rather just use std::underlying_type to deduce the integral type. Is this necessary for Python?
@alliepiper No, this change is not prompted by work on the Python API. I felt that using uint16_t as the underlying type explicitly expresses the flag bit width.
The library binary size is unaffected by this change.
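The two positions in this thread are compatible: the enum declaration can fix the bit width while code that operates on the flags deduces the type through std::underlying_type, so the width is spelled out in one place only. The sketch below is an assumption-laden illustration, not the library's code: the enumerators and operator| are hypothetical, and std::uint16_t stands in for ::cuda::std::uint16_t.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical flag values; only the declaration names the concrete width.
enum class exec_tag : std::uint16_t { none = 0, host = 1, device = 2 };

// Hypothetical helper: combines two flags without hardcoding uint16_t.
constexpr exec_tag operator|(exec_tag a, exec_tag b) {
  using U = std::underlying_type_t<exec_tag>;  // deduced from the declaration
  return static_cast<exec_tag>(static_cast<U>(a) | static_cast<U>(b));
}

// The deduced type matches the one written in the declaration.
static_assert(
    std::is_same<std::underlying_type_t<exec_tag>, std::uint16_t>::value,
    "underlying type is deduced from the enum declaration");
```

If the width in the declaration changes later, the helper follows it automatically.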