
Simplify lowering of aten.reflection_pad2d to linalg

Open · frederik-h opened this issue 1 year ago · 7 comments

The lowering uses a tensor.extract_slice operation to extract the input slices to be used for the padding, followed by a linalg.generic operation that reverses the order of their elements as required for the reflection.

This PR removes the use of the linalg.generic. Instead, it uses the ability of tensor.extract_slice to extract a reversed slice directly, by specifying negative strides together with offsets that point to the end of the slice in the input tensor. This simplifies the generated IR and potentially makes it more optimization friendly.
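For illustration, here is a minimal sketch of the difference for the left padding of a 1x1x4x4 input with a padding width of 2 (shapes, attributes, and SSA names are illustrative, not the exact IR emitted by the lowering):

```mlir
// Previous lowering (sketch): extract the columns to mirror, then reverse
// them with a linalg.generic whose input map negates the innermost dim.
#id  = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
#rev = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, -d3 + 1)>

%slice = tensor.extract_slice %input[0, 0, 0, 1] [1, 1, 4, 2] [1, 1, 1, 1]
    : tensor<1x1x4x4xf32> to tensor<1x1x4x2xf32>
%init = tensor.empty() : tensor<1x1x4x2xf32>
%reversed = linalg.generic
    {indexing_maps = [#rev, #id],
     iterator_types = ["parallel", "parallel", "parallel", "parallel"]}
    ins(%slice : tensor<1x1x4x2xf32>) outs(%init : tensor<1x1x4x2xf32>) {
^bb0(%in: f32, %out: f32):
  linalg.yield %in : f32
} -> tensor<1x1x4x2xf32>

// New lowering (sketch): anchor the offset at the far end of the mirrored
// range and step backwards with stride -1 to get the reversed slice directly.
%reversed2 = tensor.extract_slice %input[0, 0, 0, 2] [1, 1, 4, 2] [1, 1, 1, -1]
    : tensor<1x1x4x4xf32> to tensor<1x1x4x2xf32>
```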

frederik-h · Jan 19 '24 13:01

> Nice simplification!

Thanks!

frederik-h · Jan 22 '24 16:01

> Just want to get a second opinion first. @MaheshRavishankar do you have a good idea of how well supported negative extraction strides are in tensor.extract_slice?

I found out that they are supported thanks to this revision: https://reviews.llvm.org/D134147. If it turns out that there is no reason not to use them, as I expect, then I think the tensor dialect documentation and test suite should be extended to cover this possibility.

frederik-h · Jan 22 '24 17:01

> Just want to get a second opinion first. @MaheshRavishankar do you have a good idea of how well supported negative extraction strides are in tensor.extract_slice?

> I found out that they are supported thanks to this revision: https://reviews.llvm.org/D134147. If it turns out that there is no reason not to use them, as I expect, then I think the tensor dialect documentation and test suite should be extended to cover this possibility.

Not saying we shouldn't support this, just that there are a few places where I could see this going wrong. I'll unblock for now, but (at least from IREE's perspective, which I think is the main consumer of these Linalg lowerings) it's not immediately clear to me that this is better than the previous lowering. The end state I'm expecting anyway is to support + use tensor.pad.

I hope my first message didn't come off the wrong way either; I do appreciate the code changes/simplifications here. Thanks for the good work :)

qedawkins · Jan 22 '24 17:01

> Just want to get a second opinion first. @MaheshRavishankar do you have a good idea of how well supported negative extraction strides are in tensor.extract_slice?

> I found out that they are supported thanks to this revision: https://reviews.llvm.org/D134147. If it turns out that there is no reason not to use them, as I expect, then I think the tensor dialect documentation and test suite should be extended to cover this possibility.

> Not saying we shouldn't support this, just that there are a few places where I could see this going wrong. I'll unblock for now

Thanks, but I will wait a bit for feedback from @MaheshRavishankar before I merge the changes.

> The end state I'm expecting anyway is to support + use tensor.pad.

Something like this would probably be ideal. However, I am wondering whether it would make sense to extend tensor.pad so that it can take tensors for the padding on the different sides of the input, instead of using tensor.extract and complex select operations in the body of tensor.pad.
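For illustration, a hypothetical 1-D sketch of what a reflection pad looks like when expressed with tensor.pad today (left side only; %input : tensor<4xf32> is assumed, and a full 2-D version would additionally need index arithmetic and selects to distinguish the low, high, and interior regions):

```mlir
// Hypothetical sketch: reflect-pad 2 elements on the left of %input
// (tensor<4xf32>), excluding the edge element, so [a,b,c,d] -> [c,b,a,b,c,d].
%c2 = arith.constant 2 : index
%padded = tensor.pad %input low[2] high[0] {
^bb0(%i: index):
  // Only padding positions reach this body; for the low side the mirrored
  // source index is 2 - i (i = 0 -> source 2, i = 1 -> source 1).
  %m = arith.subi %c2, %i : index
  %v = tensor.extract %input[%m] : tensor<4xf32>
  tensor.yield %v : f32
} : tensor<4xf32> to tensor<6xf32>
```

Taking the padding values as pre-computed tensors instead would avoid this per-element index computation in the pad body.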

> I hope my first message didn't come off the wrong way either; I do appreciate the code changes/simplifications here. Thanks for the good work :)

It didn't! Thanks for your feedback.

frederik-h · Jan 22 '24 17:01

I think it depends on what torch-mlir wants to model, but in general I think negative strides are a bad idea. They add unnecessary complications to the rest of the stack. I am fairly certain this is a relatively untested path (the patch is from the end of 2022, and I'm not sure how well it was plumbed through). I don't think I have standing to block this, because these are really downstream issues, but I would try to avoid negative strides without making sure they work in an end-to-end stack.

MaheshRavishankar · Jan 22 '24 17:01

> I think it depends on what torch-mlir wants to model, but in general I think negative strides are a bad idea. They add unnecessary complications to the rest of the stack.

@stellaraccident What do you think? I just wanted to try out whether this works, and I do not have any strong opinions on which approach is better. I see that this could cause downstream issues, given that the possibility of negative strides has not been documented previously and there have been no uses so far.

frederik-h · Feb 08 '24 07:02

I expect that Ivan actually added it to support cases like this (but when coming from users, where you have fewer options to avoid it), given what I know of his work.

Stride hacks like this are a common thing to do in linear algebra libraries, but as Mahesh says... sometimes it is best to stay on surer footing.

You can ask Ivan (he's on Discord as hc84) and/or test it out and understand the corner cases.

Since this is an "internal" detail vs. something we're trying to model from the user, I would probably take the more sure path and focus my time on getting to the end state.

stellaraccident · Feb 08 '24 08:02