What is the correct order to use DistributedDataParallel and QAT Quantizer?
Describe the issue:
Environment:
- NNI version: Master(3.0?)
- Training service (local|remote|pai|aml|etc): local
- Client OS: Arch Linux
- Server OS (for remote mode only): N/A
- Python version: 3.11
- PyTorch/TensorFlow version: PyTorch 1.13
- Is conda/virtualenv/venv used?: No
- Is running in Docker?: No
Configuration:
- Experiment config (remember to remove secrets!): N/A
- Search space: N/A
Log message:
- nnimanager.log:
- dispatcher.log:
- nnictl stdout and stderr:
How to reproduce it?: I'm trying to do QAT with DDP, but I'm confused about the order in which to initialize the optimizer. According to the official PyTorch code, the optimizer should be created after the model is wrapped in DDP. But in NNI, the example in https://github.com/microsoft/nni/blob/master/nni/compression/quantization/qat_quantizer.py shows that we should create the optimizer first, pass it into the evaluator, and then let the QAT Quantizer wrap the model. I can't find any example code for DDP+QAT; could anyone help? A sketch of the two conflicting orderings is below.
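For concreteness, here is a minimal sketch of the ordering I'm comparing against (plain PyTorch only, no NNI; it assumes a `torchrun` launch, and the tiny `Linear` model is just a stand-in for a real network):

```python
# Minimal sketch of the two orderings (assumes launch via
# `torchrun --nproc_per_node=2 this_script.py`, which sets the
# rendezvous environment variables for init_process_group).
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="gloo")  # use "nccl" + device_ids on GPU

model = torch.nn.Linear(10, 2)  # stand-in for the real network

# Order recommended by the PyTorch DDP tutorial:
# wrap the model first, then build the optimizer from the
# wrapped module's parameters.
ddp_model = DDP(model)
optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)

# Order implied by the NNI QAT example:
# build the optimizer on the bare model, pass it into the evaluator,
# and only afterwards let QATQuantizer wrap/patch the model --
# the opposite order. (Omitted here because I don't know how to
# combine it with DDP; that is exactly my question.)
```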