iree
[Codegen] Add arith.clip/arith.clamp
Clipping or clamping is defined as:
clip(x, min_value, max_value) = min(max(x, min_value), max_value)
Some backends can generate better instructions when they know a value is being clamped. For example, gfx9 has med3 instructions, which can be used to clamp a value to a range. See https://github.com/ROCm/composable_kernel/pull/1182/files for an example of such an optimization. Besides being more efficient, an explicit clamp also gives range analysis better range information.
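As a rough illustration of why a median-of-three operation can implement a clamp, and why a clamp gives range analysis tight bounds, here is a minimal Python sketch. The names `med3` and `clip_range` are hypothetical helpers written for this example; `med3` only models "median of three" on ordinary numbers and makes no claim about the NaN or floating-point behavior of the actual gfx9 instructions.

```python
def clip(x, min_value, max_value):
    """Clamp x into [min_value, max_value], as defined in this issue."""
    return min(max(x, min_value), max_value)


def med3(a, b, c):
    """Median of three values (hypothetical model; NaN handling is ignored)."""
    return sorted((a, b, c))[1]


def clip_range(x_lo, x_hi, min_value, max_value):
    """Bounds of clip(x, ...) for any x in [x_lo, x_hi].

    clip is monotonically non-decreasing in x, so clamping the two
    endpoints of the input interval gives the output interval.
    """
    return clip(x_lo, min_value, max_value), clip(x_hi, min_value, max_value)


if __name__ == "__main__":
    # With min_value <= max_value, clip and med3 agree on every input.
    for x in range(-10, 11):
        assert clip(x, -2, 7) == med3(x, -2, 7)

    # A wide input range collapses to the clamp bounds.
    print(clip_range(-100, 100, 0, 6))
```

The equivalence only holds when `min_value <= max_value`. If the bounds are swapped, the min/max formula always returns `max_value`, while the median of three does not, so a lowering would presumably need to know (or check) that the bounds are ordered.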
Tasks:
- Add arith.clip to upstream MLIR (in arith or math, depending on discussion)
- Add optimized codegen for arith.clip on gfx9 that lowers to the med3 intrinsic
I will work on this. So the task is simply:
- Add the `clip` op to the `arith` dialect in the LLVM repo.
- When lowering the arith dialect to LLVM, I should handle the gfx9 case to use med3, right? Are there any other cases that use med3, or other better lowerings?
@Groverkss @kuhar @ScottTodd, Ping