[refactor] Remove redundant codegen of BitExtractStmt
Issue: #6134
There is actually hardware support on both Nvidia and AMD for native bit extract (a lot faster than using lower level bit operations). Maybe we should keep them not demoted?
Could you elaborate a bit on what we should do to make use of that? Regarding this PR, `demote_operations` is just an abstraction layer over what the current codegen of the different backends already does.
We would want to be able to optionally demote BFE when the backend says it's not supported; when it is supported, codegen should take the BFE CHI IR and emit the backend's native instruction.
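The proposed direction could be sketched as a capability check that gates the demotion pass. The `Arch` values mirror Taichi backend names, but `supports_native_bit_extract` and `should_demote_bit_extract` are hypothetical helpers for illustration, not the actual Taichi API:

```cpp
// Sketch: keep BitExtractStmt in CHI IR, and only demote it during lowering
// when the target backend lacks a native bit-field-extract instruction.
enum class Arch { x64, cuda, amdgpu, vulkan };

bool supports_native_bit_extract(Arch arch) {
  // NVIDIA exposes BFE in PTX (`bfe.u32`) and AMD GCN has `v_bfe_u32`;
  // assume other targets fall back to shift-and-mask here.
  return arch == Arch::cuda || arch == Arch::amdgpu;
}

bool should_demote_bit_extract(Arch arch) {
  // Demote only when no native instruction is available; otherwise codegen
  // consumes the BitExtractStmt directly.
  return !supports_native_bit_extract(arch);
}
```

This keeps `demote_operations` as the portable fallback while letting CUDA and AMD codegen emit the faster hardware instruction.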
Sounds good. Then I think not performing demotion in CHI IR is a better choice. I'll now close this PR and submit another one in the future.