How to determine shadowLayers
I found that in modeling_phonelm_npu, shadowLayers is {0, 1, 3, 4}, while in modeling_qwen_npu it is {1, 2, 26}. How are the shadow layers determined?
The shadow layers are determined by a threshold used when exporting the int8 PyTorch model; the threshold controls how many layers become shadow layers. For each layer, the activation scale after removing the top 0.01% of outliers is compared with the original scale: when the original scale is larger than the clipped scale multiplied by the threshold, clipping would lose too much of the activation range, so the layer is kept unclipped and treated as a shadow layer. You can refer to tools/convertor/profiling_activation/utils/get_input_output_scales.py for more details.
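To make the rule concrete, here is a minimal sketch of that comparison. This is illustrative only (per-tensor absolute-max scales are assumed, and the function name is made up); the real logic, including the exact percentile handling, lives in tools/convertor/profiling_activation/utils/get_input_output_scales.py:

```python
import numpy as np

def is_shadow_layer(activations: np.ndarray, t01m_thre: float) -> bool:
    """Sketch of the shadow-layer test: True if clipping the top 0.01%
    of outliers would lose too much range at this threshold."""
    # Original quantization scale: absolute max of the activations.
    origin_scale = np.abs(activations).max()
    # Scale after removing the top 0.01% of outliers
    # (the 99.99th percentile of |x|).
    clipped_scale = np.percentile(np.abs(activations), 99.99)
    # If the original scale exceeds the clipped scale by more than the
    # threshold factor, the outliers are too extreme to clip away,
    # so the layer is kept unclipped -> shadow layer.
    return origin_scale > clipped_scale * t01m_thre
```

This also explains the shape of model_res_acc.json below: as t01m_thre grows, the test passes for fewer layers, so the no_clip counts shrink.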
I don't quite understand. I have the model_res_acc.json of the exported model; part of it is shown below. It contains entries for multiple thresholds. How can I derive shadowLayers from this file?
"32": {
"t01m_thre": 32,
"clip_input_num": 206,
"no_clip_input_num": 19,
"clip_output_num": 215,
"no_clip_output_num": 10,
"no_clip_input_name": [
"model.layers.0.mlp.down_proj",
"model.layers.1.mlp.down_proj",
"model.layers.2.mlp.down_proj",
"model.layers.3.mlp.down_proj",
"model.layers.4.mlp.down_proj",
"model.layers.6.mlp.down_proj",
"model.layers.8.mlp.down_proj",
"model.layers.18.mlp.down_proj",
"model.layers.19.mlp.down_proj",
"model.layers.21.mlp.down_proj",
"model.layers.22.mlp.down_proj",
"model.layers.23.mlp.down_proj",
"model.layers.24.mlp.down_proj",
"model.layers.25.mlp.down_proj",
"model.layers.26.mlp.down_proj",
"model.layers.27.mlp.down_proj",
"model.layers.29.mlp.down_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.mlp.down_proj"
],
"no_clip_output_name": [
"model.layers.1.mlp.gate_proj",
"model.layers.1.mlp.up_proj",
"model.layers.1.mlp.down_proj",
"model.layers.2.mlp.down_proj",
"model.layers.3.self_attn.o_proj",
"model.layers.4.mlp.down_proj",
"model.layers.23.self_attn.o_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.self_attn.o_proj",
"model.layers.31.mlp.down_proj"
],
"res": 0.872
},
"64": {
"t01m_thre": 64,
"clip_input_num": 220,
"no_clip_input_num": 5,
"clip_output_num": 220,
"no_clip_output_num": 5,
"no_clip_input_name": [
"model.layers.0.mlp.down_proj",
"model.layers.1.mlp.down_proj",
"model.layers.4.mlp.down_proj",
"model.layers.8.mlp.down_proj",
"model.layers.30.mlp.down_proj"
],
"no_clip_output_name": [
"model.layers.1.mlp.gate_proj",
"model.layers.1.mlp.up_proj",
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.self_attn.o_proj"
],
"res": 0.815
},
"128": {
"t01m_thre": 128,
"clip_input_num": 223,
"no_clip_input_num": 2,
"clip_output_num": 222,
"no_clip_output_num": 3,
"no_clip_input_name": [
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj"
],
"no_clip_output_name": [
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.self_attn.o_proj"
],
"res": 0.784
},
"152": {
"t01m_thre": 152,
"clip_input_num": 223,
"no_clip_input_num": 2,
"clip_output_num": 222,
"no_clip_output_num": 3,
"no_clip_input_name": [
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj"
],
"no_clip_output_name": [
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.self_attn.o_proj"
],
"res": 0.784
},
"10000000": {
"t01m_thre": 10000000,
"clip_input_num": 225,
"no_clip_input_num": 0,
"clip_output_num": 225,
"no_clip_output_num": 0,
"no_clip_input_name": [],
"no_clip_output_name": [],
"res": 0.411
}
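Based on the explanation above, the shadow layers at a given threshold should be the layer indices that appear in that entry's no_clip_input_name / no_clip_output_name lists. A hypothetical parsing sketch (the field names match the snippet above; the regex-based index extraction is my own, not repo code):

```python
import json
import re

with open("model_res_acc.json") as f:
    res_acc = json.load(f)

# Pick the threshold entry that gives acceptable accuracy ("res")
# with an acceptable number of shadow layers.
entry = res_acc["64"]
names = entry["no_clip_input_name"] + entry["no_clip_output_name"]

# Extract the layer index from names like "model.layers.30.mlp.down_proj".
shadow_layers = sorted({int(re.search(r"layers\.(\d+)\.", n).group(1))
                        for n in names})
print(shadow_layers)  # for the "64" entry above: [0, 1, 4, 8, 30, 31]
```

The choice of entry is a trade-off: lower thresholds keep "res" higher but leave more layers unclipped (more shadow layers to run outside the NPU), while the huge 10000000 threshold clips everything and drops "res" to 0.411.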
Have you solved this problem? I'm running into the same question.