iree [Winograd] Add filtering by annotations for Winograd rewrites

This PR adds filtering functionality to Winograd conv rewrites based on a __winograd_conv attribute. Convolutions are expected to be annotated with this attribute before the pass invocation, most likely by a transform script called from iree-preprocessing-transform-interpreter as preprocessing.

Since Winograd affects numerical results, and the performance benefit is different for different shapes, it is usually necessary to select specific shapes to use Winograd. This PR enables users to make this selection based on shapes or parameters of specific convolutions.

May 09 '24 20:05 Max191

This PR adds filtering functionality to Winograd conv rewrites based on a __winograd_conv attribute.

Oh no, this looks like filter-based patterns. I wonder if we can have better matcher instead of using attribute?

May 09 '24 21:05 hanhanW

This PR adds filtering functionality to Winograd conv rewrites based on a __winograd_conv attribute.

Oh no, this looks like filter-based patterns. I wonder if we can have better matcher instead of using attribute?

There is not really a good way to tell if winograd is worth using for a specific convolution (it changes numerics and it does not necessarily help performance), so right now it has to be decided by the user. Maybe in the future there can be some selection based on benchmarking so we no longer rely on user input, but for now this is necessary.

May 09 '24 21:05 Max191

This PR adds filtering functionality to Winograd conv rewrites based on a __winograd_conv attribute.

Oh no, this looks like filter-based patterns. I wonder if we can have better matcher instead of using attribute?

The issue is that we don't have a good idea of when it is good to use a convolution for winograd vs something else, so for now we are having the user annotate it (that is what we did in the past). So these annotations are expected to be user provided.

Yeah what Max said.

May 09 '24 21:05 qedawkins

I think it might be better to move this whole pass to preprocessing.

IIUC is the "deployment mode" is that you run the transform dialect script first and then you run this pass?

Yeah, that is how it will work right now. I think it could make sense in Preprocessing, but I can move it in another PR later.

EDIT: The only way I could see it being potentially useful elsewhere is if we wanted to experiment with doing some transformation after partial tiling (to try and avoid global transient buffers). I haven't thought about that much at all, though, so have no idea if that would be a good idea.

May 10 '24 14:05 Max191

I think it might be better to move this whole pass to preprocessing. IIUC is the "deployment mode" is that you run the transform dialect script first and then you run this pass?

Yeah, that is how it will work right now. I think it could make sense in Preprocessing, but I can move it in another PR later.

EDIT: The only way I could see it being potentially useful elsewhere is if we wanted to experiment with doing some transformation after partial tiling (to try and avoid global transient buffers). I haven't thought about that much at all, though, so have no idea if that would be a good idea.

Then this needs to be documented/tested... cause now we are introducing an expected ordering between the transform dialect based preprocessing and pipeline options.

May 11 '24 01:05 MaheshRavishankar