
node matching error while using NNCF with Transformers-LT (fairseq)

sharathns93 opened this issue 3 years ago · 3 comments

I am seeing a node-matching error while using NNCF with a Transformer model based on the fairseq implementation (logs attached): error_logs_nncf.txt

Steps to replicate the issue:

  1. git clone https://github.com/mit-han-lab/hardware-aware-transformers.git

  2. cd hardware-aware-transformers

  3. pip install --editable .

  4. pip install -U torch==1.9.1

  5. pip install nncf

  6. bash configs/wmt14.en-de/get_preprocessed.sh

  7. git apply nncf_diff.txt
     (the patch nncf_diff.txt is attached to this issue)

  8. python train.py --configs=configs/wmt14.en-de/subtransformer/[email protected][email protected] --sub-configs=configs/wmt14.en-de/subtransformer/common.yml --validate-subtransformer --cpu
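For context, the NNCF integration applied by nncf_diff.txt boils down to building an NNCF config and passing the model through `create_compressed_model(model, nncf_config)` in train.py (that call is visible in the tracebacks later in this thread). A minimal sketch of such a config follows; the concrete values (sample size, input type) are my assumptions for illustration, not taken from the actual diff:

```python
# Hypothetical NNCF config for int8 quantization of the subtransformer.
# The values below are placeholders, not the ones used in nncf_diff.txt.
nncf_config_dict = {
    # NNCF needs an input shape to trace the model graph;
    # [batch, sequence_length] of token IDs is assumed here.
    "input_info": {"sample_size": [1, 128], "type": "long"},
    "compression": {"algorithm": "quantization"},
}

# In train.py this dict would then be wrapped and applied roughly as:
#   from nncf import NNCFConfig
#   from nncf.torch import create_compressed_model
#   nncf_config = NNCFConfig.from_dict(nncf_config_dict)
#   compression_ctrl, model = create_compressed_model(model, nncf_config)
print(nncf_config_dict["compression"]["algorithm"])
```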

sharathns93 avatar Feb 14 '22 05:02 sharathns93

I could reproduce the original issue, and it seems to be fixed by #1600. However, after that I run into another failure:

Traceback (most recent call last):
  File "train.py", line 407, in <module>
    cli_main()
  File "train.py", line 403, in cli_main
    main(args)
  File "train.py", line 74, in main
    compression_ctrl, model = create_compressed_model(model, nncf_config)
  File "/home/vshampor/work/nncf/nncf/telemetry/decorator.py", line 71, in wrapped
    retval = fn(*args, **kwargs)
  File "/home/vshampor/work/nncf/nncf/torch/model_creation.py", line 110, in create_compressed_model
    compressed_model = builder.apply_to(nncf_network)
  File "/home/vshampor/work/nncf/nncf/torch/compression_method_api.py", line 123, in apply_to
    transformation_layout = self.get_transformation_layout(model)
  File "/home/vshampor/work/nncf/nncf/torch/compression_method_api.py", line 142, in get_transformation_layout
    layout = self._get_transformation_layout(model)
  File "/home/vshampor/work/nncf/nncf/torch/quantization/algo.py", line 634, in _get_transformation_layout
    self._pt_quantizer_setup = self._get_quantizer_setup(target_model)
  File "/home/vshampor/work/nncf/nncf/torch/quantization/algo.py", line 720, in _get_quantizer_setup
    single_config_quantizer_setup = self._get_single_config_quantizer_setup(target_model)
  File "/home/vshampor/work/nncf/nncf/torch/quantization/algo.py", line 713, in _get_single_config_quantizer_setup
    single_config_quantizer_setup = setup_generator.generate_setup()
  File "/home/vshampor/work/nncf/nncf/torch/quantization/algo.py", line 396, in generate_setup
    quantization_proposal = prop_graph_solver.run_on_ip_graph(merged_ip_graph)
  File "/home/vshampor/work/nncf/nncf/common/quantization/quantizer_propagation/solver.py", line 505, in run_on_ip_graph
    quantizer_setup = quant_prop_graph.create_quantizer_setup(self._weight_quantizable_node_names_vs_qconfigs)
  File "/home/vshampor/work/nncf/nncf/common/quantization/quantizer_propagation/graph.py", line 1141, in create_quantizer_setup
    setup = self._handle_output_quantizers_for_weights_as_outputs_ops(setup, pqid_vs_qpid,
  File "/home/vshampor/work/nncf/nncf/common/quantization/quantizer_propagation/graph.py", line 1178, in _handle_output_quantizers_for_weights_as_outputs_ops
    wao_qp_id = wao_op_node_key_vs_wq_id[wao_op_node_key]
KeyError: '283 TransformerSuperModel/TransformerDecoder[decoder]/EmbeddingSuper[embed_tokens]/embedding_0'
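The KeyError points at the decoder's EmbeddingSuper node inside NNCF's weights-as-outputs handling. fairseq-style decoders commonly share the input embedding weight with the output projection (`--share-decoder-input-output-embed`), which is exactly the pattern this code path exists for. A toy sketch of that pattern, assuming weight tying is enabled here (my own illustration, not the HAT/fairseq code):

```python
import torch
import torch.nn as nn

class TiedDecoder(nn.Module):
    """Toy illustration of decoder input/output embedding tying: one
    weight serves as both the embedding table and the output projection,
    so from a graph-tracing perspective the embedding op's weight also
    feeds a model output ("weights as outputs")."""
    def __init__(self, vocab_size=10, embed_dim=4):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, embed_dim)

    def forward(self, tokens):
        x = self.embed_tokens(tokens)            # weight used as a lookup table
        return x @ self.embed_tokens.weight.t()  # same weight reused as projection

model = TiedDecoder()
logits = model(torch.tensor([[1, 2, 3]]))
print(tuple(logits.shape))  # (1, 3, 10)
```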

vshampor avatar Feb 24 '23 16:02 vshampor

@vshampor, is this still valid? If so, any ideas about the second failure?

MaximProshin avatar Jun 20 '23 08:06 MaximProshin

This still reproduces as of today, with the second failure; investigating.

vshampor avatar Jun 22 '23 16:06 vshampor