quda icon indicating copy to clipboard operation
quda copied to clipboard

Use kernel arg

Open jxy opened this issue 3 years ago • 10 comments

We first make sure when BlasArg is instantiated, the argument for functor caxpyxmazMR_ is truly passed as a kernel argument. Then we make use_kernel_arg an int, and introduce it to BlasFunctor and caxpyxmazMR_, and let device::use_kernel_arg know that we really want use_kernel_arg if its value is more than 1.

jxy avatar Jul 27 '22 16:07 jxy

Jenkins: Can one of the admins verify this patch?

mathiaswagner avatar Jul 27 '22 16:07 mathiaswagner

What's the motivation for this?

mathiaswagner avatar Jul 27 '22 17:07 mathiaswagner

This ensures that BlasArg for caxpyxmazMR_ is private to each thread, even if device::max_kernel_arg_size() is 0. It provides backend implementations freedom to choose different ways to pass kernel arguments without accidentally break the code.

jxy avatar Jul 27 '22 19:07 jxy

I've changed int to an enum, and since it is a custom type, we need a few extra member functions in kernel_param so that we can keep target_device.h independent of kernel_helper.h.

I also fixed HeavyQuarkResidualNorm_ and xpyHeavyQuarkResidualNorm_, which received the same treatment as caxpyxmazMR_ now.

jxy avatar Aug 08 '22 20:08 jxy

I think we should make the enum a enum class which is more error proofing. Also personally I am fine with the it but I remember the enum's in QUDA needs to all be upper case?

hummingtree avatar Aug 09 '22 13:08 hummingtree

Yes, for consistency with QUDA, I would strongly prefer all caps for the enum. Also, instead of yes / no / always, I would prefer TRUE / FALSE / ALWAYS

maddyscientist avatar Aug 09 '22 13:08 maddyscientist

Apologies if I was unclear @jxy. What I meant for the enum names was to use USE_KERNEL_ARG_FALSE, USE_KERNEL_ARG_TRUE, USE_KERNEL_ARG_ALWAYS.

maddyscientist avatar Aug 09 '22 14:08 maddyscientist

This is enum class, now. The names are properly scoped. You actually can't directly use FALSE/TRUE/ALWAYS as it is without the enum class name and double colon, unless we move to c++20 and use using enum declaration.

jxy avatar Aug 09 '22 15:08 jxy

That's a good point, that it's scoped so the prefix is unnecessary.

maddyscientist avatar Aug 09 '22 15:08 maddyscientist

Latest discussion on portability call is that we should be able to delete the static_asserts in blas_core.cuh and reduce_core.cuh which should be redundant given the static_asserts in the target launcher.

@jcosborn to review this PR.

maddyscientist avatar Sep 14 '22 21:09 maddyscientist

Removed static_assert as we discussed. I left comments in the constructors to remind ourselves in the future.

jxy avatar Oct 27 '22 16:10 jxy

test this please

maddyscientist avatar Nov 12 '22 00:11 maddyscientist

@Jenkins test this please

mathiaswagner avatar Nov 16 '22 19:11 mathiaswagner