TypeError: forward() got an unexpected keyword argument 'weight'

Open WithFoxSquirrel opened this issue 2 years ago • 8 comments

As the title says, I am using aimet-1.26-torch (QAT) to optimize nanodet-cspnet. The procedure is as follows:

  1. Load the datasets and train nanodet-cspnet (training with pytorch_lightning 1.7.0).
  2. calculate_quantsim_accuracy (QuantizationSimModel and compute_encodings).
  3. apply_cross_layer_equalization.
  4. Fine-tune quantsim.model.

Exception has occurred: TypeError
forward() got an unexpected keyword argument 'weight'

The crash happens in nanodet/model/head/gfl_head.py, at the call to self.loss_bbox (line 333):

            # regression loss
line:333        loss_bbox = self.loss_bbox(
                pos_decode_bbox_pred,
                pos_decode_bbox_targets,
                weight=weight_targets,
                avg_factor=1.0,
            )

where loss_bbox is:

class GIoULoss(nn.Module):
    def __init__(self, eps=1e-6, reduction="mean", loss_weight=1.0):
        super(GIoULoss, self).__init__()
        self.eps = eps
        self.reduction = reduction
        self.loss_weight = loss_weight

    def forward(
        self,
        pred,
        target,
        weight=None,
        avg_factor=None,
        reduction_override=None,
        **kwargs,
    ):

GIoULoss.forward() does accept the weight argument, so I suspect quantsim.model drops that argument, which causes the error. Has anyone encountered this before? Thanks a lot.

WithFoxSquirrel avatar Jun 28 '23 07:06 WithFoxSquirrel

@quic-hitameht, could you help answer this?

quic-mangal avatar Jun 28 '23 17:06 quic-mangal

@WithFoxSquirrel Thanks for reporting this. Could you tell us at which step you are running into this TypeError?

Also, inside the nanodet/model/head/gfl_head.py file, when calling self.loss_bbox (line 333), could you please pass weight and avg_factor as positional arguments instead of keyword arguments? Let us know if you still run into the same TypeError with the following change.

loss_bbox = self.loss_bbox(
                     pos_decode_bbox_pred,
                     pos_decode_bbox_targets,
                     weight_targets,
                     1.0,)

quic-hitameht avatar Jul 03 '23 09:07 quic-hitameht

I tried the suggested change, but an assertion now fails at the loss_bbox = self.loss_bbox(...) call:

Exception has occurred: AssertionError
Not enough tensor quantizers (1) allocated
  "File:/Qualcomm/aimet-1.26/aimet/Examples/nanodet/nanodet/model/head/gfl_head.py", line 333, in loss_single

The failure may be at qc_quantize_op.py, line 820 or 821:

  def forward(self, *inputs):
        """
        Forward-pass routine. This quantizes the weights before delegating to the wrapped module and
        then quantizes the output before returning the same
        :param inputs: Inputs passed to the module in the forward pass
        :return: Quantized output from the wrapped module
        """

        self.apply_gating_logic()

        # Quantize inputs
line:820        torch_inputs = custom_tensor_utils.to_torch_tensor(inputs)
line:821        quantized_inputs = self._quantize_activation(torch_inputs, self.input_quantizers, 'input')
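
The forward(self, *inputs) signature above may also explain the original TypeError: a wrapper whose forward accepts only positional inputs rejects any keyword argument before the wrapped module is reached. Below is a minimal pure-Python sketch of that behaviour. PositionalOnlyWrapper and giou_like_loss are hypothetical stand-ins written for illustration; they are not AIMET or nanodet code.

```python
class PositionalOnlyWrapper:
    """Hypothetical stand-in for a wrapper whose forward takes only *inputs."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def forward(self, *inputs):
        # Only positional inputs are forwarded to the wrapped callable.
        return self.wrapped(*inputs)

    def __call__(self, *args, **kwargs):
        # Like torch.nn.Module.__call__, pass everything through to forward().
        return self.forward(*args, **kwargs)


def giou_like_loss(pred, target, weight=None, avg_factor=None):
    """Hypothetical loss that accepts `weight`, as GIoULoss.forward does."""
    return (pred - target) * (1.0 if weight is None else weight)


wrapped_loss = PositionalOnlyWrapper(giou_like_loss)

# Positional call: reaches the wrapped loss.
positional_result = wrapped_loss(3.0, 1.0, 0.5, 1.0)

# Keyword call: rejected by the wrapper's forward(), not by the loss itself.
try:
    wrapped_loss(3.0, 1.0, weight=0.5, avg_factor=1.0)
    keyword_error = None
except TypeError as exc:
    keyword_error = str(exc)

print(positional_result)
print(keyword_error)
```

If this is what happens here, the loss itself is never called with weight=...; the wrapper placed around it raises first, which would match the observation that GIoULoss.forward() does declare the argument.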

WithFoxSquirrel avatar Jul 04 '23 06:07 WithFoxSquirrel

I tried to debug the aimet_torch source code, but I ran into errors when setting up aimet-1.26 from source, so I installed Aimet-torch_gpu_1.26.0-cp38-cp38-linux_x86_64.wheel instead.

WithFoxSquirrel avatar Jul 04 '23 06:07 WithFoxSquirrel

The AssertionError is raised because there is a mismatch between the tensors to be quantized and the tensor quantizers allocated for them: there are more tensors than allocated tensor quantizers (1). The assertion is raised inside the self._quantize_activation method. Since I can't reproduce this on my end, could you please tell us the number of input quantizers (len(self.input_quantizers)) and output quantizers (len(self.output_quantizers)), and the number of tensors to be quantized for both the input and the output of the self.loss_bbox module? After the change suggested above, the number of inputs should be 4 (pos_decode_bbox_pred, pos_decode_bbox_targets, weight_targets, avg_factor) and the number of outputs should be 1 (loss_bbox).

You can get the above information simply by printing the QuantizationSimModel object.
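
To make the mismatch concrete, here is a simplified mimic of the count check. It is a sketch under stated assumptions, not the AIMET implementation: quantize_activation and identity are hypothetical names, and the real _quantize_activation does considerably more than this.

```python
def quantize_activation(tensors, quantizers):
    """Simplified mimic of the count check; not the AIMET implementation."""
    assert len(tensors) <= len(quantizers), (
        f"Not enough tensor quantizers ({len(quantizers)}) allocated"
    )
    return [quantize(t) for quantize, t in zip(quantizers, tensors)]


def identity(value):
    """Placeholder quantizer that returns its input unchanged."""
    return value


# One input against one allocated quantizer: the check passes.
single = quantize_activation([0.25], [identity])

# Four positional inputs (pred, target, weight, avg_factor) against one quantizer.
try:
    quantize_activation([0.25, 0.5, 1.0, 1.0], [identity])
    message = None
except AssertionError as exc:
    message = str(exc)

print(single)
print(message)
```

With four positional inputs and a single allocated input quantizer, a check of this shape fails with the message reported in the traceback.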

quic-hitameht avatar Jul 04 '23 08:07 quic-hitameht

Since I ran into errors setting up aimet_torch from source, I can't get len(self.input_quantizers) and len(self.output_quantizers). Here are fragments of the printed QuantizationSimModel object after quantization:

----------------------------------------------------------
....

Layer: head.distribution_project
  Input[0]: Not quantized
  -------
  Output[0]: bw=8, encoding-present=True
    StaticGrid TensorQuantizer:
    quant-scheme:QuantScheme.post_training_tf_enhanced, round_mode=RoundingMode.ROUND_NEAREST, bitwidth=8, enabled=True
    min:0.0, max=9.476959228515625, delta=0.03716454654932022, offset=0.0
  -------
----------------------------------------------------------
Layer: head.loss_qfl
  Input[0]: Not quantized
  -------
  Output[0]: Not quantized
  -------
----------------------------------------------------------
Layer: head.loss_dfl
  Input[0]: Not quantized
  -------
  Output[0]: Not quantized
  -------
----------------------------------------------------------
Layer: head.loss_bbox
  Input[0]: Not quantized
  -------
  Output[0]: Not quantized
  -------
----------------------------------------------------------
Layer: head.cls_convs.0.0.conv
.....

I see that the layer head.loss_bbox has not been quantized. That may be the cause of the AssertionError, but I don't know how this happened.
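
The dump can be summarised per layer with a small parser, which gives a rough stand-in for the quantizer counts requested above. This is only a sketch: summarize_quantsim_dump is a hypothetical helper, it assumes the printed format matches the excerpt, and it assumes each Input[...] or Output[...] entry corresponds to one allocated quantizer.

```python
import re

# Matches lines such as "  Input[0]: Not quantized" or "  Param[ih]: ...".
ENTRY = re.compile(r"^\s*(Input|Output|Param)\[[^\]]+\]:\s*(.*)$")


def summarize_quantsim_dump(text):
    """Count Input/Output/Param entries per layer and how many are quantized."""
    summary = {}
    layer = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Layer:"):
            layer = stripped[len("Layer:"):].strip()
            summary[layer] = {}
            continue
        match = ENTRY.match(line)
        if match and layer is not None:
            kind, status = match.groups()
            counts = summary[layer].setdefault(kind, {"total": 0, "quantized": 0})
            counts["total"] += 1
            if not status.startswith("Not quantized"):
                counts["quantized"] += 1
    return summary


sample = """\
Layer: head.distribution_project
  Input[0]: Not quantized
  -------
  Output[0]: bw=8, encoding-present=True
  -------
Layer: head.loss_bbox
  Input[0]: Not quantized
  -------
  Output[0]: Not quantized
  -------
"""

report = summarize_quantsim_dump(sample)
for name, counts in report.items():
    print(name, counts)
```

Under those assumptions, head.loss_bbox has a single Input entry, while the call at line 333 passes four tensors, which would be consistent with the "Not enough tensor quantizers (1) allocated" assertion.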

WithFoxSquirrel avatar Jul 05 '23 02:07 WithFoxSquirrel

Hi,

I have a similar issue. From the error log, the rnn call is causing this problem.

Layer: rnn
  Input[0]: Not quantized
  -------
  Param[ih]: Not quantized
  -------
  Param[hh]: Not quantized
  -------
  Output[0]: Not quantized
  -------
----------------------------------------------------------

surajpandey353 avatar Sep 12 '23 14:09 surajpandey353

Hello, I have a problem using PyTorch Lightning for the QAT process. My steps:

  1. Get the model using the PyTorch Lightning Model module:
         model = Model(config)
  2. Convert the model to a quantsim model using the AIMET API and run calibration:
         prepared_model = prepare_model(model)
         sim = QuantizationSimModel(
             prepared_model,
             dummy_input=dummy_input.cuda(),
             quant_scheme=QuantScheme.training_range_learning_with_tf_init,
             default_output_bw=8,
             default_param_bw=8,
         )
  3. Fine-tune the quantsim model using the PyTorch Lightning API:
         trainer = get_trainer(config)
         trainer.fit(sim.model, dataset)

When I run the last step to fine-tune the model, it fails. It seems that after I use the QuantizationSimModel API the model type changes to 'GraphModule', and the PyTorch Lightning API can no longer train it. Could you tell me how you fine-tuned your model?

Traceback (most recent call last):
  File "incam_e2e_aimet.py", line 232, in <module>
    main()
  File "incam_e2e_aimet.py", line 225, in main
    trainer.fit(sim.model, dataset)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 737, in fit
    self._call_and_handle_interrupt(
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 682, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 772, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1120, in _run
    self._callback_connector._attach_model_callbacks()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/connectors/callback_connector.py", line 266, in _attach_model_callbacks
    model_callbacks = self.trainer.call_hook("configure_callbacks")
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1481, in call_hook
    prev_fx_name = pl_module._current_fx_name
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1269, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'GraphModule' object has no attribute '_current_fx_name'

zhuoran-guo avatar Sep 28 '23 01:09 zhuoran-guo