iree icon indicating copy to clipboard operation
iree copied to clipboard

[CPU][ArmSME] Enable transposes for f32 and f64

Open c-rhodes opened this issue 1 year ago • 4 comments

Since [1] transposes may get eliminated by folding into defining vector.transfer_read.

[1] https://github.com/llvm/llvm-project/pull/92562

ci-extra: build_test_all_arm64

c-rhodes avatar May 20 '24 15:05 c-rhodes

Can you add ci-extra: build_test_all_arm64 as the last line of your PR summary (and push something to re-trigger CI)?

MacDue avatar May 20 '24 16:05 MacDue

Can you add ci-extra: build_test_all_arm64 as the last line of your PR summary (and push something to re-trigger CI)?

Looks like CI ran but I'll add it to summary for any updates.

c-rhodes avatar May 21 '24 07:05 c-rhodes

Can you add ci-extra: build_test_all_arm64 as the last line of your PR summary (and push something to re-trigger CI)?

Looks like CI ran but I'll add it to summary for any updates.

Just noticed ci-extra: build_test_all_arm64 was skipped, will push to trigger.

c-rhodes avatar May 21 '24 07:05 c-rhodes

I've moved this to draft, will re-open once https://github.com/llvm/llvm-project/pull/92562 lands in the next LLVM integration.

c-rhodes avatar May 21 '24 09:05 c-rhodes

Thanks Cullen!

Sorry for the delay, I've kept skipping this one without realising that this is the patch :) I blame the title:

[CPU][ArmSME] Enable transposes for f32 and f64

That's not what's happening here, is it? IIUC, this patch adds logic to identify transpose-like linalg.generic Ops and then, for ArmSME, adds "scalable" flags to the tiling config. For X86 there should be no changes, right?

That's my understanding based on the code - is this correct? If yes, could you update the summary?

Thanks Cullen!

Sorry for the delay, I've kept skipping this one without realising that this is the patch :) I blame the title:

[CPU][ArmSME] Enable transposes for f32 and f64

That's not what's happening here, is it? IIUC, this patch adds logic to identify transpose-like linalg.generic Ops and then, for ArmSME, adds "scalable" flags to the tiling config. For X86 there should be no changes, right?

It is what's happening, albeit indirectly since linalg.transpose ops are converted to linalg.generic by GeneralizeLinalgNamedOps`. But yes, no changes to X86.

That's my understanding based on the code - is this correct? If yes, could you update the summary?

Not quite, but I'll clarify in the summary to make it clearer 👍

c-rhodes avatar May 31 '24 13:05 c-rhodes

Would be good to wait for either @hanhanW or @dcaballe review.

AFAIK, Diego is distracted with other activities these days :)

This looks good to me, what do you think @hanhanW and @MacDue ?

banach-space avatar Jun 03 '24 08:06 banach-space

Thanks, just one nit! LGTM!

thanks for reviewing! Fixed the final nit 👍

c-rhodes avatar Jun 04 '24 17:06 c-rhodes