How to determine shadowLayers
I found that in modeling_phonelm_npu, shadowLayers is {0, 1, 3, 4}, while in modeling_qwen_npu it is {1, 2, 26}. How are the shadow layers determined?
The shadow layers are determined by a threshold used when exporting the int8 PyTorch model; the threshold controls how many layers become shadow layers. For each layer, the activation scale after removing the top 0.01% of outliers is compared with the original scale: when the original scale is larger than the clipped scale multiplied by the threshold, clipping would lose too much of the activation range, so the layer is kept unclipped and treated as a shadow layer. You can refer to tools/convertor/profiling_activation/utils/get_input_output_scales.py for more details.
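To make the rule concrete, here is a minimal sketch of that comparison. This is illustrative only (per-tensor absolute-max scales are assumed, and the function name is made up); the real logic, including the exact percentile handling, lives in tools/convertor/profiling_activation/utils/get_input_output_scales.py:

```python
import numpy as np

def is_shadow_layer(activations: np.ndarray, t01m_thre: float) -> bool:
    """Sketch of the shadow-layer test: True if clipping the top 0.01%
    of outliers would lose too much range at this threshold."""
    # Original quantization scale: absolute max of the activations.
    origin_scale = np.abs(activations).max()
    # Scale after removing the top 0.01% of outliers
    # (the 99.99th percentile of |x|).
    clipped_scale = np.percentile(np.abs(activations), 99.99)
    # If the original scale exceeds the clipped scale by more than the
    # threshold factor, the outliers are too extreme to clip away,
    # so the layer is kept unclipped -> shadow layer.
    return origin_scale > clipped_scale * t01m_thre
```

This also explains the shape of model_res_acc.json below: as t01m_thre grows, the test passes for fewer layers, so the no_clip counts shrink.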
I don't quite understand. I have the model_res_acc.json of the exported model; part of it is shown below. It contains entries for multiple thresholds. How can I derive shadowLayers from this file?
"32": {
"t01m_thre": 32,
"clip_input_num": 206,
"no_clip_input_num": 19,
"clip_output_num": 215,
"no_clip_output_num": 10,
"no_clip_input_name": [
"model.layers.0.mlp.down_proj",
"model.layers.1.mlp.down_proj",
"model.layers.2.mlp.down_proj",
"model.layers.3.mlp.down_proj",
"model.layers.4.mlp.down_proj",
"model.layers.6.mlp.down_proj",
"model.layers.8.mlp.down_proj",
"model.layers.18.mlp.down_proj",
"model.layers.19.mlp.down_proj",
"model.layers.21.mlp.down_proj",
"model.layers.22.mlp.down_proj",
"model.layers.23.mlp.down_proj",
"model.layers.24.mlp.down_proj",
"model.layers.25.mlp.down_proj",
"model.layers.26.mlp.down_proj",
"model.layers.27.mlp.down_proj",
"model.layers.29.mlp.down_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.mlp.down_proj"
],
"no_clip_output_name": [
"model.layers.1.mlp.gate_proj",
"model.layers.1.mlp.up_proj",
"model.layers.1.mlp.down_proj",
"model.layers.2.mlp.down_proj",
"model.layers.3.self_attn.o_proj",
"model.layers.4.mlp.down_proj",
"model.layers.23.self_attn.o_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.self_attn.o_proj",
"model.layers.31.mlp.down_proj"
],
"res": 0.872
},
"64": {
"t01m_thre": 64,
"clip_input_num": 220,
"no_clip_input_num": 5,
"clip_output_num": 220,
"no_clip_output_num": 5,
"no_clip_input_name": [
"model.layers.0.mlp.down_proj",
"model.layers.1.mlp.down_proj",
"model.layers.4.mlp.down_proj",
"model.layers.8.mlp.down_proj",
"model.layers.30.mlp.down_proj"
],
"no_clip_output_name": [
"model.layers.1.mlp.gate_proj",
"model.layers.1.mlp.up_proj",
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.self_attn.o_proj"
],
"res": 0.815
},
"128": {
"t01m_thre": 128,
"clip_input_num": 223,
"no_clip_input_num": 2,
"clip_output_num": 222,
"no_clip_output_num": 3,
"no_clip_input_name": [
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj"
],
"no_clip_output_name": [
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.self_attn.o_proj"
],
"res": 0.784
},
"152": {
"t01m_thre": 152,
"clip_input_num": 223,
"no_clip_input_num": 2,
"clip_output_num": 222,
"no_clip_output_num": 3,
"no_clip_input_name": [
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj"
],
"no_clip_output_name": [
"model.layers.1.mlp.down_proj",
"model.layers.30.mlp.down_proj",
"model.layers.31.self_attn.o_proj"
],
"res": 0.784
},
"10000000": {
"t01m_thre": 10000000,
"clip_input_num": 225,
"no_clip_input_num": 0,
"clip_output_num": 225,
"no_clip_output_num": 0,
"no_clip_input_name": [],
"no_clip_output_name": [],
"res": 0.411
}
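Based on the explanation above, the shadow layers at a given threshold should be the layer indices that appear in that entry's no_clip_input_name / no_clip_output_name lists. A hypothetical parsing sketch (the field names match the snippet above; the regex-based index extraction is my own, not repo code):

```python
import json
import re

with open("model_res_acc.json") as f:
    res_acc = json.load(f)

# Pick the threshold entry that gives acceptable accuracy ("res")
# with an acceptable number of shadow layers.
entry = res_acc["64"]
names = entry["no_clip_input_name"] + entry["no_clip_output_name"]

# Extract the layer index from names like "model.layers.30.mlp.down_proj".
shadow_layers = sorted({int(re.search(r"layers\.(\d+)\.", n).group(1))
                        for n in names})
print(shadow_layers)  # for the "64" entry above: [0, 1, 4, 8, 30, 31]
```

The choice of entry is a trade-off: lower thresholds keep "res" higher but leave more layers unclipped (more shadow layers to run outside the NPU), while the huge 10000000 threshold clips everything and drops "res" to 0.411.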
Have you solved this problem? I'm running into the same question.