Make module swap the main QAT flow again

Open • andrewor14 opened this issue 1 year ago • 1 comment

Stack from ghstack (oldest at bottom):

#1020
#1038
-> #1037

Summary: Following https://github.com/pytorch/ao/issues/987, this commit makes module swap the main QAT flow today. We remove all tensor subclass fake quantize injection logic, since it is not needed in either the short-term or the long-term plans for QAT. In the short term, we will continue to use a full module swap flow. We will migrate to the long-term flow only once there is general distributed support for tensor subclasses and tensor subclass composability provides meaningful benefits.
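For readers unfamiliar with the term, here is a minimal sketch of what a module swap QAT flow looks like in general. It is not the torchao implementation: `fake_quantize`, `ToyFakeQuantizedLinear`, and `swap_linears` are hypothetical names, and the 4-bit symmetric per-tensor scheme is an assumption chosen for brevity. The idea is that each `nn.Linear` is replaced in place by a subclass that fake-quantizes its weight in `forward`, while the parameters themselves stay ordinary tensors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Symmetric per-tensor fake quantization (assumed scheme, for illustration)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    # Straight-through estimator: forward sees q, backward sees the identity.
    return w + (q - w).detach()


class ToyFakeQuantizedLinear(nn.Linear):
    """Hypothetical QAT linear layer; not a torchao class."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, fake_quantize(self.weight), self.bias)


def swap_linears(model: nn.Module) -> nn.Module:
    """Recursively replace each plain nn.Linear with the toy QAT layer, in place."""
    for name, child in list(model.named_children()):
        if type(child) is nn.Linear:
            new = ToyFakeQuantizedLinear(
                child.in_features, child.out_features, bias=child.bias is not None
            )
            # Reuse the existing parameters so training continues from them.
            new.weight = child.weight
            new.bias = child.bias
            setattr(model, name, new)
        else:
            swap_linears(child)
    return model


model = swap_linears(nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)))
```

Because the swapped layers hold plain `nn.Parameter` weights, this style of flow does not depend on tensor subclass support in the distributed stack, which is the trade-off the summary describes.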

Test Plan: python test/quantization/test_qat.py

Oct 08 '24 19:10 andrewor14

:link: Helpful Links

:test_tube: See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1037

:white_check_mark: No Failures

As of commit 0609a38c604251b98cd06b19074dc35cf82accf1 with merge base 35ea27b33d79d3966278b87d4bfa4f862f18e5db: :green_heart: Looks good so far! There are no failures yet. :green_heart:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Oct 08 '24 19:10 pytorch-bot[bot]