Mixed Precision Grouped Gemm with zero points and GPT-Q semantics closes #2261
sorry, running a bit behind. we will get to it soon.
@ankutalev Thanks for submitting this feature MR. Have you checked the functionality of this feature? Could you post the result of running this feature (example 69) here?
Yes, I checked: it reports "Disposition Passed" for all scenarios ({shuffled/unshuffled} X {direct convert, no zeros, zeros, gptq}). That is itself a problem, because the new gptq semantics dequantizes the matrix in a different way; the test in the example is too weak to catch the difference.
I can provide unit tests if you like.
Also, I'm not happy with how I implemented the gptq mode switch, but runtime parameters seem "not CUTLASS style"; I would appreciate any advice and suggestions here =)
We are interested in having this functionality in the main branch, because nobody likes maintaining patched forks =)
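For reference on the unit tests offered above, here is a minimal sketch of per-group zero-point dequantization. The function name, data layout, and the `(q - z) * s` convention are illustrative assumptions only; the PR's gptq path may handle zero points differently, which is exactly the semantic difference the example's "Disposition Passed" check fails to distinguish.

```python
def dequantize(q, scales, zeros, group_size):
    """Reference per-group zero-point dequantization: w = (q - z) * s.

    q:          flat list of quantized integer values
    scales:     one float scale per group
    zeros:      one integer zero point per group
    group_size: number of consecutive values sharing a scale/zero pair
    """
    return [(qi - zeros[i // group_size]) * scales[i // group_size]
            for i, qi in enumerate(q)]

# Two groups of two values each.
print(dequantize([5, 7, 2, 4], scales=[0.5, 2.0], zeros=[4, 3], group_size=2))
# → [0.5, 1.5, -2.0, 2.0]
```

A unit test that compares kernel output against a reference like this for each scenario separately (zeros vs. gptq) would catch a semantic divergence that an end-to-end pass/fail disposition cannot.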
@Junkai-Wu Hi! Any updates here?
@ankutalev we are reviewing the changes internally. We will merge this PR once it is approved and merged in our internal repo.
Hi! Any news here?
This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.
@ankutalev we are reviewing the changes internally. We will merge this PR once it is approved and merged in our internal repo.
Gentle ping