Add `cuda.core.utils.get_aligned_dtype`
Many CUDA Python projects need to declare a custom alignment for a given dtype. While NumPy's built-in and structured dtypes work in most cases, this is one of the few requirements they cannot meet today. @seberg has put together a prototype that addresses this (https://github.com/numpy/numpy/issues/28318#issuecomment-2678883667). The plan is for cuda.core to provide this infrastructure, let other teams start using it, gather feedback, and eventually upstream it to NumPy.
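For context, here is a minimal sketch of the gap, using only existing NumPy APIs. It builds a structured dtype shaped like CUDA's `float2` (which CUDA declares with 8-byte alignment) and shows that `align=True` only applies the natural alignment of the members. The name `float2_like` is made up for illustration and is not part of any proposed API.

```python
import numpy as np

# Two float32 fields, laid out like CUDA's float2.
# `float2_like` is an illustrative name, not a proposed API.
float2_like = np.dtype([("x", np.float32), ("y", np.float32)], align=True)

# align=True pads fields to their natural alignment; the struct's own
# alignment is the largest alignment among its members.
member_alignment = np.dtype(np.float32).alignment

print("itemsize: ", float2_like.itemsize)
print("alignment:", float2_like.alignment)
print("float32 alignment:", member_alignment)

# There is no dtype argument to request a larger alignment (such as the
# 8 bytes CUDA uses for float2), so the struct stays at member alignment.
assert float2_like.alignment == member_alignment
```

The struct is 8 bytes wide but inherits the alignment of `float32`, so a dtype that matches the CUDA vector type's alignment cannot be expressed this way.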
cc @shwina @tpn @ZzEeKkAa @isvoid for vis
@zzeekkaa does Sebastian's prototype work for you out of the box, or does it require additional tweaks?
Sebastian's prototype works out of the box as an intermediate solution for nvmath. However, complex* types as well as a half-precision packed complex type are also required.
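To illustrate the complex case, here is a rough sketch using only existing NumPy calls. It prints the size and alignment NumPy reports for its built-in complex dtypes, then emulates a packed half-precision complex value with a structured dtype, since NumPy has no built-in half-precision complex dtype. The name `half_complex_like` and its field names are assumptions for illustration only. For comparison, CUDA declares `float2` with 8-byte, `double2` with 16-byte, and `__half2` with 4-byte alignment.

```python
import numpy as np

# Built-in complex dtypes: the reported alignment is platform dependent and
# is typically that of the component type, not of the full complex value.
for name in ("complex64", "complex128"):
    dt = np.dtype(name)
    print(name, "itemsize:", dt.itemsize, "alignment:", dt.alignment)

# Emulated packed half-precision complex (two float16 values).
# `half_complex_like` is an illustrative name, not a proposed API.
half_complex_like = np.dtype(
    [("real", np.float16), ("imag", np.float16)], align=True
)

print("half complex itemsize: ", half_complex_like.itemsize)
print("half complex alignment:", half_complex_like.alignment)

# The struct is 4 bytes wide but only inherits float16's alignment.
assert half_complex_like.alignment == np.dtype(np.float16).alignment
```

In both cases the alignment NumPy reports comes from the component type, which is why these types would need the same custom-alignment treatment as the structured dtypes.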