Running `example/auto_compression/semantic_segmentation` gives NaN loss
Hi, I ran run.py in PaddleSlim/example/auto_compression/semantic_segmentation with the following arguments:

args.config_path = "configs/pp_liteseg/pp_liteseg_mine.yaml"
args.save_dir = './save_sparse_model'
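(For reference, I believe this is equivalent to the command-line invocation below, assuming run.py keeps the example's standard --config_path and --save_dir flags:)

python run.py \
    --config_path configs/pp_liteseg/pp_liteseg_mine.yaml \
    --save_dir ./save_sparse_model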
but I got:
2022-07-22 12:59:49,955-INFO: quant_aware config {'weight_quantize_type': 'channel_wise_abs_max', 'activation_quantize_type': 'moving_average_abs_max', 'weight_bits': 8, 'activation_bits': 8, 'not_quant_pattern': ['skip_quant'], 'quantize_op_types': ['conv2d', 'depthwise_conv2d', 'mul', 'matmul', 'matmul_v2'], 'dtype': 'int8', 'window_size': 10000, 'moving_rate': 0.9, 'for_tensorrt': False, 'is_full_quantize': False, 'name': 'Distillation', 'loss': 'l2', 'node': [], 'alpha': 1.0, 'teacher_model_dir': '../../../../714/save/output', 'teacher_model_filename': 'model.pdmodel', 'teacher_params_filename': 'model.pdiparams'}
2022-07-22 13:00:36,223-INFO: Total iter: 0, epoch: 0, batch: 0, loss: [0.02632807]
2022-07-22 13:00:36,593-INFO: Total iter: 1, epoch: 0, batch: 1, loss: [0.0527272]
2022-07-22 13:00:36,777-INFO: Total iter: 2, epoch: 0, batch: 2, loss: [0.1315988]
2022-07-22 13:00:37,332-INFO: Total iter: 3, epoch: 0, batch: 3, loss: [0.3486606]
2022-07-22 13:00:37,476-INFO: Total iter: 4, epoch: 0, batch: 4, loss: [9.675454]
2022-07-22 13:00:37,607-INFO: Total iter: 5, epoch: 0, batch: 5, loss: [61.659126]
2022-07-22 13:00:37,731-INFO: Total iter: 6, epoch: 0, batch: 6, loss: [4.6202196e+27]
2022-07-22 13:00:37,859-INFO: Total iter: 7, epoch: 0, batch: 7, loss: [nan]
2022-07-22 13:00:37,979-INFO: Total iter: 8, epoch: 0, batch: 8, loss: [nan]
2022-07-22 13:00:38,101-INFO: Total iter: 9, epoch: 0, batch: 9, loss: [nan]
.....
2022-07-22 13:00:42,007-INFO: Total iter: 41, epoch: 0, batch: 41, loss: [nan]
configs/pp_liteseg/pp_liteseg_mine.yaml:

Global:
  reader_config: ../../../../714/save/config_paddleslim.yml
  model_dir: ../../../../714/save/output
  model_filename: model.pdmodel
  params_filename: model.pdiparams

TrainConfig:
  epochs: 14
  logging_iter: 1
  eval_iter: 90
  learning_rate:
    type: PiecewiseDecay
    boundaries: [900]
    values: [0.001, 0.0005]
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 0.0005
config_paddleslim.yml:

train_dataset:
  type: Dataset
  # dataset_root: /root/share/program/save/data/portraint
  # val_path: /root/share/program/save/data/portraint/txtfiles/test.txt
  dataset_root: ../../../..
  train_path: ../../../../test/test.txt
  num_classes: 2
  transforms:
    - type: Resize
      target_size: [224, 224]
    - type: Normalize
  mode: train

val_dataset:
  type: Dataset
  # dataset_root: /root/share/program/save/data/portraint
  # val_path: /root/share/program/save/data/portraint/txtfiles/test.txt
  dataset_root: ../../../..
  val_path: ../../../../test/test.txt
  num_classes: 2
  transforms:
    - type: Resize
      target_size: [224, 224]
    - type: Normalize
  mode: val

export:
  transforms:
    - type: Resize
      target_size: [224, 224]
    - type: Normalize

# optimizer:
#   type: sgd
#   momentum: 0.9
#   weight_decay: 5.0e-4

# lr_scheduler:
#   type: PolynomialDecay
#   learning_rate: 0.02  # finetune 0.02
#   end_lr: 0
#   power: 0.9
#   warmup_iters: 100
#   warmup_start_lr: 1.0e-5

# loss:
#   types:
#     - type: CrossEntropyLoss
#       # min_kept: 1254400  # batch_size * 224 * 224 // 16
#     - type: CrossEntropyLoss
#       # min_kept: 1254400
#     - type: CrossEntropyLoss
#       # min_kept: 1254400
#     - type: CrossEntropyLoss
#   coef: [1, 1, 1, 1]

model:
  type: PPLiteSeg
  backbone:
    type: STDC2
  backbone_indices: [0, 1, 2, 3]
  arm_out_chs: [16, 32, 32, 64]
  seg_head_inter_chs: [8, 16, 16, 32]
  # pretrained: output/pp_liteseg_stdc2_myhumanseg_modify_1_syntheses/iter_4200/model.pdparams
Could you please tell me the possible reason? Thanks!!
Please try setting the distillation node explicitly, or replace the distillation loss and try again. For details on how to set the distillation node, please refer to this document: https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/hyperparameter_tutorial.md#1%E8%87%AA%E5%8A%A8%E8%92%B8%E9%A6%8F%E6%95%88%E6%9E%9C%E4%B8%8D%E7%90%86%E6%83%B3%E6%80%8E%E4%B9%88%E8%87%AA%E4%B8%BB%E9%80%89%E6%8B%A9%E8%92%B8%E9%A6%8F%E8%8A%82%E7%82%B9
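As a rough sketch (not tested against your model), an explicit Distillation section added to pp_liteseg_mine.yaml could look like the following. The tensor names under node are placeholders: replace them with actual output-tensor names from your exported model.pdmodel (for example, tensors near the segmentation head, which you can find by inspecting the model graph). soft_label is one alternative distillation loss you could try instead of the default l2, if your PaddleSlim version supports it:

Distillation:
  alpha: 1.0
  loss: l2                 # or try another loss, e.g. soft_label
  node:
    - conv2d_123.tmp_1     # placeholder: replace with a real tensor name from your model
    - conv2d_130.tmp_1     # placeholder: e.g. an output close to the segmentation head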