
Missing support for vectorizing quantized convolution ops

Open hanhanW opened this issue 2 years ago • 1 comments

Follow-up from https://github.com/google/iree/issues/8411: quantized convolution ops are not vectorized. This introduces a temporary buffer allocation because the types mismatch. We landed https://github.com/google/iree/pull/8526 as a workaround. Ideally, we'd like to vectorize those operations as well.

We already have a flow to vectorize convolution ops; the missing part is a pattern that converts the quantized version into a normal version, as is done for matmul (or, if necessary, extending the vectorization logic to account for the zero points directly).
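The matmul-style rewrite mentioned above rests on an algebraic identity: subtracting zero points inside the multiply-accumulate is equivalent to one plain integer convolution plus cheap correction terms. A minimal numpy sketch (not IREE code; the 1D shapes, names, and zero-point values are illustrative assumptions) verifying the identity:

```python
# Sketch (not IREE code): a zero-point-adjusted 1D convolution can be
# rewritten as a plain integer convolution plus correction terms, the
# same rewrite used for quantized matmul.
import numpy as np

rng = np.random.default_rng(0)
K = 3                                # kernel width (illustrative)
x = rng.integers(0, 256, size=16)    # uint8-style input values
w = rng.integers(-128, 128, size=K)  # int8-style filter values
zx, zw = 128, 3                      # input / filter zero points (assumed)

# Reference: the quantized conv subtracts zero points before multiplying.
ref = np.array([np.sum((x[i:i + K] - zx) * (w - zw))
                for i in range(len(x) - K + 1)])

# Rewrite: one plain conv plus corrections that only need window sums.
plain = np.array([np.sum(x[i:i + K] * w) for i in range(len(x) - K + 1)])
x_win_sums = np.array([np.sum(x[i:i + K]) for i in range(len(x) - K + 1)])
rewritten = plain - zw * x_win_sums - zx * np.sum(w) + K * zx * zw

assert np.array_equal(ref, rewritten)
```

The plain convolution is then vectorizable by the existing flow, and the correction terms are reductions that fold into cheap epilogue ops.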

This does not block quantized model exploration. We can prioritize this issue once quantized convolution-based models are on our tracking list. Filing it now for tracking purposes.

hanhanW avatar Apr 14 '22 21:04 hanhanW

Unassigned myself as I haven't had time recently to work on this P1 issue.

pzread avatar Sep 09 '22 16:09 pzread

@vmurali, please coordinate with @rsuderman.

dcaballe avatar Oct 11 '22 22:10 dcaballe

Yes, this should be working and committed. We can decompose quantized convolutions into regular integer convolutions plus additional reductions and average-pooling layers.
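The average-pooling piece of that decomposition works because the zero-point correction only needs the sum of the input over each convolution window, and an average pool scaled by the window size produces exactly that sum. A small numpy sketch (not the committed IREE lowering; shapes and the zero-point value are illustrative assumptions):

```python
# Sketch (not the committed IREE lowering): the filter-zero-point
# correction term, -zw * sum(x over each window), can be produced by an
# average-pooling op scaled by the window size, so the quantized conv
# lowers to a regular conv plus an avg-pool and constant folds.
import numpy as np

rng = np.random.default_rng(1)
K = 4                                           # window size (illustrative)
x = rng.integers(0, 256, size=12).astype(np.int64)
zw = 7                                          # filter zero point (assumed)

win_sums = np.array([np.sum(x[i:i + K]) for i in range(len(x) - K + 1)])
avg_pool = np.array([np.mean(x[i:i + K]) for i in range(len(x) - K + 1)])

# avg_pool * K reproduces the per-window sums the correction needs.
assert np.allclose(avg_pool * K, win_sums)
correction = -zw * (avg_pool * K).astype(np.int64)
assert np.array_equal(correction, -zw * win_sums)
```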

rsuderman avatar Jan 04 '23 23:01 rsuderman