rknn-toolkit2

I keep running into problems when converting a transformer ONNX model. Can this toolkit actually convert a transformer to RKNN successfully?

Open XiaBing992 opened this issue 11 months ago • 3 comments

Error log:

I rknn-toolkit2 version: 2.3.0
W config: Please make sure the model can be dynamic when enable 'config.dynamic_input'!
I The 'dynamic_input' function has been enabled, the MaxShape is dynamic_input[0] = [[20, 1, 4096], [1, 1, 20, 64], [20, 1, 32, 2], [44, 1, 2, 128], [44, 1, 2, 128]]! The following functions are subject to the MaxShape:
  1. The quantified dataset needs to be configured according to MaxShape
  2. The eval_perf or eval_memory return the results of MaxShape
I Loading : 100%|█████████████████████████████████████████████████████| 7/7 [00:00<00:00, 18.80it/s]
W load_onnx: Please note that some float16/float64 data types in the model have been modified to float32!
W load_onnx: The config.mean_values is None, zeros will be set for input 0!
W load_onnx: The config.std_values is None, ones will be set for input 0!
W load_onnx: The config.mean_values is None, zeros will be set for input 2!
W load_onnx: The config.std_values is None, ones will be set for input 2!
W load_onnx: The config.mean_values is None, zeros will be set for input 3!
W load_onnx: The config.std_values is None, ones will be set for input 3!
W load_onnx: The config.mean_values is None, zeros will be set for input 4!
W load_onnx: The config.std_values is None, ones will be set for input 4!
D base_optimize ...
D base_optimize done.
D fold_constant ...
D fold_constant done.
D fold_constant remove nodes = ['/transformer_layer/mlp/Mul_1', '/transformer_layer/mlp/Mul', '/transformer_layer/mlp/Div', '/transformer_layer/mlp/Add', '/transformer_layer/mlp/Gather', '/transformer_layer/mlp/Shape', '/transformer_layer/self_attention/core_attention/Concat', '/transformer_layer/self_attention/core_attention/Unsqueeze_1', '/transformer_layer/self_attention/core_attention/Unsqueeze', '/transformer_layer/self_attention/core_attention/Gather_1', '/transformer_layer/self_attention/core_attention/Shape_2', '/transformer_layer/self_attention/core_attention/Gather', '/transformer_layer/self_attention/core_attention/Shape_1', '/transformer_layer/self_attention/core_attention/Sqrt_2', '/transformer_layer/self_attention/core_attention/Sqrt_1', '/transformer_layer/self_attention/core_attention/Cast_2', '/transformer_layer/self_attention/core_attention/Div', '/transformer_layer/self_attention/core_attention/Cast_1', '/transformer_layer/self_attention/core_attention/Sqrt', '/transformer_layer/self_attention/core_attention/Cast', '/transformer_layer/self_attention/core_attention/Slice', '/transformer_layer/self_attention/core_attention/Shape', '/transformer_layer/self_attention/Concat_6', '/transformer_layer/self_attention/Unsqueeze_11', '/transformer_layer/self_attention/Unsqueeze_10', '/transformer_layer/self_attention/Gather_9', '/transformer_layer/self_attention/Shape_9', '/transformer_layer/self_attention/Gather_8', '/transformer_layer/self_attention/Shape_8', '/transformer_layer/self_attention/Where_1', '/transformer_layer/self_attention/Equal_1', '/transformer_layer/self_attention/Mul_1', '/transformer_layer/self_attention/ConstantOfShape_1', '/transformer_layer/self_attention/Concat_5', '/transformer_layer/self_attention/Unsqueeze_8', '/transformer_layer/self_attention/Unsqueeze_7', '/transformer_layer/self_attention/Gather_7', '/transformer_layer/self_attention/Shape_7', '/transformer_layer/self_attention/Gather_6', '/transformer_layer/self_attention/Shape_6', '/transformer_layer/self_attention/Where', '/transformer_layer/self_attention/Equal', '/transformer_layer/self_attention/Mul', '/transformer_layer/self_attention/ConstantOfShape', 'Concat_344', 'Slice_342', 'Shape_338', 'Concat_321', 'Unsqueeze_319', 'Unsqueeze_315', 'Gather_313', 'Shape_311', 'Concat_309', 'Unsqueeze_307', 'Unsqueeze_305', 'Unsqueeze_302', 'Unsqueeze_298', 'Unsqueeze_291', 'Unsqueeze_286', 'Gather_282', 'Shape_280', 'Gather_279', 'Shape_277', 'Concat_274', 'Slice_272', 'Shape_268', 'Concat_251', 'Unsqueeze_249', 'Unsqueeze_245', 'Gather_243', 'Shape_241', 'Concat_239', 'Unsqueeze_237', 'Unsqueeze_235', 'Unsqueeze_232', 'Cast_230', 'Cast_229', 'Div_228', 'Unsqueeze_224', 'Unsqueeze_217', 'Unsqueeze_212', 'Mul_208', 'Squeeze_206', 'Slice_204', 'Shape_200', 'Gather_199', 'Shape_197', 'Gather_196', 'Shape_194', '/transformer_layer/self_attention/Concat_2', '/transformer_layer/self_attention/Unsqueeze_5', '/transformer_layer/self_attention/Unsqueeze_4', '/transformer_layer/self_attention/Gather_5', '/transformer_layer/self_attention/Shape_5', '/transformer_layer/self_attention/Gather_4', '/transformer_layer/self_attention/Shape_4', '/transformer_layer/self_attention/Concat_1', '/transformer_layer/self_attention/Unsqueeze_3', '/transformer_layer/self_attention/Unsqueeze_2', '/transformer_layer/self_attention/Gather_3', '/transformer_layer/self_attention/Shape_3', '/transformer_layer/self_attention/Gather_2', '/transformer_layer/self_attention/Shape_2', '/transformer_layer/self_attention/Concat', '/transformer_layer/self_attention/Unsqueeze_1', '/transformer_layer/self_attention/Unsqueeze', '/transformer_layer/self_attention/Gather_1', '/transformer_layer/self_attention/Shape_1', '/transformer_layer/self_attention/Gather', '/transformer_layer/self_attention/Shape']
D Fixed the shape information of some tensor!
D correct_ops ...
D correct_ops done.
D fuse_ops ...
D fuse_ops results:
D remove_invalid_cast: remove node = ['/transformer_layer/input_layernorm/Cast', '/transformer_layer/input_layernorm/Cast_1', '/transformer_layer/input_layernorm/Cast_2']
D reduce_reshape_op_around_split: remove node = ['/transformer_layer/self_attention/Reshape', '/transformer_layer/self_attention/Reshape_1', '/transformer_layer/self_attention/Reshape_2', '/transformer_layer/self_attention/Split'], add node = ['/transformer_layer/self_attention/query_key_value/Add_output_0_rs', '/transformer_layer/self_attention/Split']
D remove_invalid_cast: remove node = ['/transformer_layer/Cast']
D remove_invalid_slice: remove node = ['Slice_226']
D swap_transpose_mul: remove node = ['/transformer_layer/self_attention/core_attention/Transpose', '/transformer_layer/self_attention/core_attention/Mul'], add node = ['/transformer_layer/self_attention/core_attention/Mul', '/transformer_layer/self_attention/core_attention/Transpose']
D remove_invalid_slice: remove node = ['Slice_300']
D swap_transpose_mul: remove node = ['/transformer_layer/self_attention/core_attention/Transpose_2', '/transformer_layer/self_attention/core_attention/Mul_1'], add node = ['/transformer_layer/self_attention/core_attention/Mul_1', '/transformer_layer/self_attention/core_attention/Transpose_2']
D remove_invalid_cast: remove node = ['/transformer_layer/self_attention/core_attention/Cast_3']
D swap_transpose_cast: remove node = ['/transformer_layer/self_attention/core_attention/Transpose_1', '/transformer_layer/self_attention/core_attention/Cast_4'], add node = ['/transformer_layer/self_attention/core_attention/Cast_4', '/transformer_layer/self_attention/core_attention/Transpose_1']
D remove_invalid_cast: remove node = ['/transformer_layer/post_attention_layernorm/Cast']
D replace_rms_norm: remove node = ['/transformer_layer/post_attention_layernorm/Pow', '/transformer_layer/post_attention_layernorm/ReduceMean', '/transformer_layer/post_attention_layernorm/Add', '/transformer_layer/post_attention_layernorm/Sqrt', '/transformer_layer/post_attention_layernorm/Div', '/transformer_layer/post_attention_layernorm/Mul', '/transformer_layer/post_attention_layernorm/Mul_1'], add node = ['/transformer_layer/Add_output_0_tp', '/transformer_layer/Add_output_0_tp_rs', '/transformer_layer/post_attention_layernorm/ReduceMean_2rmsn', '/transformer_layer/post_attention_layernorm/ReduceMean_2rmsn_rs', '/transformer_layer/post_attention_layernorm/ReduceMean_2rmsn_rs_tp']
D remove_invalid_cast: remove node = ['/transformer_layer/post_attention_layernorm/Cast_1']
D replace_parallel_slice_by_split: remove node = ['/transformer_layer/mlp/Slice', '/transformer_layer/mlp/Slice_1'], add node = ['/transformer_layer/mlp/Slice_2sp']
D remove_invalid_cast: remove node = ['/transformer_layer/Cast_1']
D replace_rms_norm: remove node = ['/transformer_layer/input_layernorm/Pow', '/transformer_layer/input_layernorm/ReduceMean', '/transformer_layer/input_layernorm/Add', '/transformer_layer/input_layernorm/Sqrt', '/transformer_layer/input_layernorm/Div', '/transformer_layer/input_layernorm/Mul', '/transformer_layer/input_layernorm/Mul_1'], add node = ['hidden_states_tp', 'hidden_states_tp_rs', '/transformer_layer/input_layernorm/ReduceMean_2rmsn', '/transformer_layer/input_layernorm/ReduceMean_2rmsn_rs', '/transformer_layer/input_layernorm/ReduceMean_2rmsn_rs_tp']
D replace_parallel_slice_by_split: remove node = ['Slice_214', 'Slice_220'], add node = ['Slice_214_2sp']
D replace_parallel_slice_by_split: remove node = ['Slice_288', 'Slice_294'], add node = ['Slice_288_2sp']
D fuse_matmul_softmax_matmul_to_sdpa: remove node = ['/transformer_layer/self_attention/core_attention/MatMul', '/transformer_layer/self_attention/core_attention/Add', '/transformer_layer/self_attention/core_attention/Softmax', '/transformer_layer/self_attention/core_attention/MatMul_1'], add node = ['/transformer_layer/self_attention/core_attention/Mul_output_0_tp', '/transformer_layer/self_attention/core_attention/Mul_output_0_tp_rs', '/transformer_layer/self_attention/core_attention/Mul_1_output_0_rs', '/transformer_layer/self_attention/core_attention/Cast_4_output_0_tp', '/transformer_layer/self_attention/core_attention/Cast_4_output_0_tp_rs', '/transformer_layer/self_attention/core_attention/Where_output_0_tp', '/transformer_layer/self_attention/core_attention/Where_output_0_tp_rs', '/transformer_layer/self_attention/core_attention/MatMul_1_2sdpa', '/transformer_layer/self_attention/core_attention/MatMul_1_output_0_sdpa_tp', '/transformer_layer/self_attention/core_attention/MatMul_1_output_0_sdpa_tp_rs']
D fuse_transpose_reshape: remove node = ['/transformer_layer/Add_output_0_tp']
D fuse_reshape_transpose: remove node = ['/transformer_layer/post_attention_layernorm/ReduceMean_2rmsn_rs_tp']
D replace_exswish: remove node = ['/transformer_layer/mlp/Sigmoid', '/transformer_layer/mlp/Mul_2'], add node = ['/transformer_layer/mlp/Sigmoid_2swish']
D fuse_transpose_reshape: remove node = ['hidden_states_tp']
D fuse_reshape_transpose: remove node = ['/transformer_layer/input_layernorm/ReduceMean_2rmsn_rs_tp']
D fuse_two_transpose: remove node = ['/transformer_layer/self_attention/core_attention/Transpose']
D fuse_transpose_reshape: remove node = ['/transformer_layer/self_attention/core_attention/Mul_1_output_0_rs']
D remove_invalid_cast: remove node = ['/transformer_layer/self_attention/core_attention/Cast_4']
D fuse_two_transpose: remove node = ['/transformer_layer/self_attention/core_attention/Transpose_1']
D fuse_transpose_reshape: remove node = ['/transformer_layer/self_attention/core_attention/Where_output_0_tp_rs']
D fuse_transpose_reshape_transpose: remove node = ['/transformer_layer/self_attention/core_attention/MatMul_1_output_0_sdpa_tp']
D fuse_transpose_reshape: remove node = ['/transformer_layer/self_attention/core_attention/Mul_output_0_tp_rs', '/transformer_layer/self_attention/core_attention/Cast_4_output_0_tp_rs']
D fuse_reshape_transpose: remove node = ['/transformer_layer/self_attention/core_attention/MatMul_1_output_0_sdpa_tp_rs']
D convert_matmul_to_exmatmul: remove node = ['/transformer_layer/self_attention/query_key_value/MatMul'], add node = ['/transformer_layer/input_layernorm/Mul_1_output_0_tp', '/transformer_layer/input_layernorm/Mul_1_output_0_tp_rs', '/transformer_layer/self_attention/query_key_value/MatMul', '/transformer_layer/self_attention/query_key_value/MatMul_output_0_mm_tp', '/transformer_layer/self_attention/query_key_value/MatMul_output_0_mm_tp_rs']
D unsqueeze_to_4d_add: remove node = [], add node = ['/transformer_layer/self_attention/query_key_value/MatMul_output_0_rs', '/transformer_layer/self_attention/query_key_value/Add_output_0-rs']
D convert_gather_to_slice_reshape: remove node = ['Gather_253'], add node = ['Gather_253_2sl', 'Gather_253_2sl_rs']
D convert_gather_to_slice_reshape: remove node = ['Gather_256'], add node = ['Gather_256_2sl', 'Gather_256_2sl_rs']
D convert_gather_to_slice_reshape: remove node = ['Gather_323'], add node = ['Gather_323_2sl', 'Gather_323_2sl_rs']
D convert_gather_to_slice_reshape: remove node = ['Gather_326'], add node = ['Gather_326_2sl', 'Gather_326_2sl_rs']
D convert_gather_to_slice_reshape: remove node = ['Gather_254'], add node = ['Gather_254_2sl', 'Gather_254_2sl_rs']
D convert_gather_to_slice_reshape: remove node = ['Gather_257'], add node = ['Gather_257_2sl', 'Gather_257_2sl_rs']
D convert_unsqueeze_to_reshape: remove node = ['Unsqueeze_264'], add node = ['Unsqueeze_264_2rs']
D convert_unsqueeze_to_reshape: remove node = ['Unsqueeze_266'], add node = ['Unsqueeze_266_2rs']
D merge_dims_and_convert_ND_to_4D_concat: remove node = ['Concat_267'], add node = ['onnx::Concat_168_rs', 'onnx::Concat_170_rs', 'Concat_267_to4D', 'Concat_267_to4D_gather', 'Concat_267_to4D_gather_rs']
D convert_gather_to_slice_reshape: remove node = ['Gather_324'], add node = ['Gather_324_2sl', 'Gather_324_2sl_rs']
D convert_gather_to_slice_reshape: remove node = ['Gather_327'], add node = ['Gather_327_2sl', 'Gather_327_2sl_rs']
D convert_unsqueeze_to_reshape: remove node = ['Unsqueeze_334'], add node = ['Unsqueeze_334_2rs']
D convert_unsqueeze_to_reshape: remove node = ['Unsqueeze_336'], add node = ['Unsqueeze_336_2rs']
D merge_dims_and_convert_ND_to_4D_concat: remove node = ['Concat_337'], add node = ['onnx::Concat_249_rs', 'onnx::Concat_251_rs', 'Concat_337_to4D', 'Concat_337_to4D_gather', 'Concat_337_to4D_gather_rs']
D convert_unsqueeze_to_reshape: remove node = ['/transformer_layer/self_attention/Unsqueeze_6'], add node = ['/transformer_layer/self_attention/Unsqueeze_6_2rs']
D squeeze_nd_expand: remove node = [], add node = ['/transformer_layer/self_attention/Unsqueeze_6_output_0_rs', '/transformer_layer/self_attention/Expand_output_0-rs']
D convert_unsqueeze_to_reshape: remove node = ['/transformer_layer/self_attention/Unsqueeze_9'], add node = ['/transformer_layer/self_attention/Unsqueeze_9_2rs']
D squeeze_nd_expand: remove node = [], add node = ['/transformer_layer/self_attention/Unsqueeze_9_output_0_rs', '/transformer_layer/self_attention/Expand_1_output_0-rs']
D convert_matmul_to_exmatmul: remove node = ['/transformer_layer/self_attention/dense/MatMul'], add node = ['/transformer_layer/self_attention/core_attention/Reshape_output_0_tp', '/transformer_layer/self_attention/core_attention/Reshape_output_0_tp_rs', '/transformer_layer/self_attention/dense/MatMul', '/transformer_layer/self_attention/dense/MatMul_output_0_mm_tp', '/transformer_layer/self_attention/dense/MatMul_output_0_mm_tp_rs']
D unsqueeze_to_4d_add: remove node = [], add node = ['hidden_states_rs', '/transformer_layer/self_attention/dense/MatMul_output_0_rs', '/transformer_layer/Add_output_0-rs']
D convert_matmul_to_exmatmul: remove node = ['/transformer_layer/mlp/dense_h_to_4h/MatMul'], add node = ['/transformer_layer/post_attention_layernorm/Mul_1_output_0_tp', '/transformer_layer/post_attention_layernorm/Mul_1_output_0_tp_rs', '/transformer_layer/mlp/dense_h_to_4h/MatMul', '/transformer_layer/mlp/dense_h_to_4h/MatMul_output_0_mm_tp', '/transformer_layer/mlp/dense_h_to_4h/MatMul_output_0_mm_tp_rs']
D unsqueeze_to_4d_split: remove node = [], add node = ['/transformer_layer/mlp/dense_h_to_4h/MatMul_output_0_rs', '/transformer_layer/mlp/Slice_output_0-rs', '/transformer_layer/mlp/Slice_1_output_0-rs']
D unsqueeze_to_4d_swish: remove node = [], add node = ['/transformer_layer/mlp/Slice_output_0_rs', '/transformer_layer/mlp/Mul_2_output_0-rs']
D unsqueeze_to_4d_mul: remove node = [], add node = ['/transformer_layer/mlp/Mul_2_output_0_rs', '/transformer_layer/mlp/Slice_1_output_0_rs', '/transformer_layer/mlp/Mul_3_output_0-rs']
D convert_matmul_to_exmatmul: remove node = ['/transformer_layer/mlp/dense_4h_to_h/MatMul'], add node = ['/transformer_layer/mlp/Mul_3_output_0_tp', '/transformer_layer/mlp/Mul_3_output_0_tp_rs', '/transformer_layer/mlp/dense_4h_to_h/MatMul', '/transformer_layer/mlp/dense_4h_to_h/MatMul_output_0_mm_tp', '/transformer_layer/mlp/dense_4h_to_h/MatMul_output_0_mm_tp_rs']
D unsqueeze_to_4d_add: remove node = [], add node = ['/transformer_layer/Add_output_0_rs', '/transformer_layer/mlp/dense_4h_to_h/MatMul_output_0_rs', 'hidden_states_out-rs']
D unsqueeze_to_4d_transpose: remove node = [], add node = ['/transformer_layer/input_layernorm/Mul_1_output_0_rs', '/transformer_layer/input_layernorm/Mul_1_output_0_tp-rs']
D input_align_4D_add: remove node = ['/transformer_layer/self_attention/query_key_value/Add'], add node = ['/transformer_layer/self_attention/query_key_value/Add']
D unsqueeze_to_4d_transpose: remove node = [], add node = ['/transformer_layer/self_attention/core_attention/Reshape_output_0_rs', '/transformer_layer/self_attention/core_attention/Reshape_output_0_tp-rs']
D unsqueeze_to_4d_transpose: remove node = [], add node = ['/transformer_layer/post_attention_layernorm/Mul_1_output_0_rs', '/transformer_layer/post_attention_layernorm/Mul_1_output_0_tp-rs']
D unsqueeze_to_4d_transpose: remove node = [], add node = ['/transformer_layer/mlp/Mul_3_output_0_rs', '/transformer_layer/mlp/Mul_3_output_0_tp-rs']
D fuse_two_reshape: remove node = ['/transformer_layer/input_layernorm/ReduceMean_2rmsn_rs']
D bypass_two_reshape: remove node = ['/transformer_layer/input_layernorm/Mul_1_output_0_tp_rs', '/transformer_layer/input_layernorm/Mul_1_output_0_tp-rs']
D fuse_two_reshape: remove node = ['/transformer_layer/self_attention/query_key_value/MatMul_output_0_mm_tp_rs', '/transformer_layer/self_attention/query_key_value/Add_output_0-rs']
D replace_parallel_slice_by_split: remove node = ['Gather_253_2sl', 'Gather_256_2sl'], add node = ['Gather_256_2sl_2sp']
D replace_parallel_slice_by_split: remove node = ['Gather_323_2sl', 'Gather_326_2sl'], add node = ['Gather_326_2sl_2sp']
D replace_parallel_slice_by_split: remove node = ['Gather_254_2sl', 'Gather_257_2sl'], add node = ['Gather_257_2sl_2sp']
D bypass_two_reshape: remove node = ['onnx::Concat_168_rs', 'Unsqueeze_264_2rs', 'onnx::Concat_170_rs', 'Unsqueeze_266_2rs', 'Reshape_275', 'Concat_267_to4D_gather_rs']
D replace_parallel_slice_by_split: remove node = ['Gather_324_2sl', 'Gather_327_2sl'], add node = ['Gather_327_2sl_2sp']
D bypass_two_reshape: remove node = ['onnx::Concat_249_rs', 'Unsqueeze_334_2rs', 'onnx::Concat_251_rs', 'Unsqueeze_336_2rs', 'Reshape_345', 'Concat_337_to4D_gather_rs']
D fuse_two_reshape: remove node = ['/transformer_layer/self_attention/Unsqueeze_6_2rs', '/transformer_layer/self_attention/Expand_output_0-rs', '/transformer_layer/self_attention/Unsqueeze_9_2rs', '/transformer_layer/self_attention/Expand_1_output_0-rs', '/transformer_layer/self_attention/core_attention/Reshape']
D bypass_two_reshape: remove node = ['/transformer_layer/self_attention/core_attention/Reshape_output_0_tp_rs', '/transformer_layer/self_attention/core_attention/Reshape_output_0_tp-rs']
D fuse_two_reshape: remove node = ['/transformer_layer/self_attention/dense/MatMul_output_0_mm_tp_rs']
D bypass_two_reshape: remove node = ['/transformer_layer/Add_output_0_rs']
D fuse_two_reshape: remove node = ['/transformer_layer/post_attention_layernorm/ReduceMean_2rmsn_rs']
D bypass_two_reshape: remove node = ['/transformer_layer/post_attention_layernorm/Mul_1_output_0_tp_rs', '/transformer_layer/post_attention_layernorm/Mul_1_output_0_tp-rs']
D fuse_two_reshape: remove node = ['/transformer_layer/mlp/dense_h_to_4h/MatMul_output_0_mm_tp_rs']
D bypass_two_reshape: remove node = ['/transformer_layer/mlp/Slice_output_0_rs', '/transformer_layer/mlp/Slice_output_0-rs', '/transformer_layer/mlp/Mul_2_output_0_rs', '/transformer_layer/mlp/Mul_2_output_0-rs', '/transformer_layer/mlp/Slice_1_output_0_rs', '/transformer_layer/mlp/Slice_1_output_0-rs', '/transformer_layer/mlp/Mul_3_output_0_rs', '/transformer_layer/mlp/Mul_3_output_0-rs', '/transformer_layer/mlp/Mul_3_output_0_tp_rs', '/transformer_layer/mlp/Mul_3_output_0_tp-rs']
D fuse_two_reshape: remove node = ['/transformer_layer/mlp/dense_4h_to_h/MatMul_output_0_mm_tp_rs']
D fuse_reshape_transpose: remove node = ['/transformer_layer/input_layernorm/Mul_1_output_0_rs']
E build: Can not reshape from [20, 1, 32, 32, 2] to [10, 1, 32, 64]
W build: ===================== WARN(10) =====================
E rknn-toolkit2 version: 2.3.0
Traceback (most recent call last):
  File "rknn/api/rknn_log.py", line 344, in rknn.api.rknn_log.error_catch_decorator.error_catch_wrapper
  File "rknn/api/rknn_base.py", line 1962, in rknn.api.rknn_base.RKNNBase.build
  File "rknn/api/graph_optimizer.py", line 1976, in rknn.api.graph_optimizer.GraphOptimizer.fuse_ops
  File "rknn/api/rules/reduce.py", line 4162, in rknn.api.rules.reduce._p_reduce_reshape_op_around_axis_op
  File "rknn/api/rknn_utils.py", line 892, in rknn.api.rknn_utils.gen_gather_for_change_split_shape
  File "rknn/api/rknn_utils.py", line 844, in rknn.api.rknn_utils.get_least_common_shape
  File "rknn/api/rknn_utils.py", line 595, in rknn.api.rknn_utils.seperate_independent_part_for_reshape
  File "rknn/api/rknn_log.py", line 95, in rknn.api.rknn_log.RKNNLog.e
ValueError: Can not reshape from [20, 1, 32, 32, 2] to [10, 1, 32, 64]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "export_transformer.py", line 62, in <module>
    ret = model.build(
  File "/home/wxb/.conda/envs/rknn/lib/python3.8/site-packages/rknn/api/rknn.py", line 192, in build
    return self.rknn_base.build(do_quantization=do_quantization, dataset=dataset, expand_batch_size=rknn_batch_size)
  File "rknn/api/rknn_log.py", line 346, in rknn.api.rknn_log.error_catch_decorator.error_catch_wrapper
  File "rknn/api/rknn_log.py", line 210, in rknn.api.rknn_log.catch_error_in_traceback
  File "rknn/api/rknn_log.py", line 169, in rknn.api.rknn_log.cache_to_message
IndexError: list index out of range
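For reference, the two shapes in the ValueError hold different numbers of elements, so no reshape between them can exist. A quick numpy check with the shapes taken from the log:

import numpy as np

# Element counts of the two shapes in the error message.
print(np.prod([20, 1, 32, 32, 2]))  # 40960
print(np.prod([10, 1, 32, 64]))     # 20480

# A reshape is only legal when the counts match; for example
# [20, 1, 32, 32, 2] -> [20, 1, 32, 64] keeps 40960 elements and works.
np.zeros([20, 1, 32, 32, 2]).reshape([20, 1, 32, 64])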

Code:

import os
import yaml
import argparse
from rknn.api import RKNN

def get_config():
    parser = argparse.ArgumentParser()
    parser.add_argument("--verbose", default=True, help="rknntoolkit verbose")
    parser.add_argument("--config_path")
    parser.add_argument("--target_platform")
    args = parser.parse_args()
    return args

if __name__ == "__main__":
    config = get_config()
    with open(config.config_path) as file:
        file_data = file.read()
    yaml_config = yaml.safe_load(file_data)
    print(yaml_config)
    model = RKNN(config.verbose)

    # Config: these shapes match the MaxShape reported in the log above
    # (last_len = sum_len - seq_len = 44).
    seq_len = 20
    sum_len = 64
    last_len = sum_len - seq_len
    dynamic_input = [
        [[seq_len, 1, 4096], [1, 1, seq_len, sum_len], [seq_len, 1, 32, 2], [last_len, 1, 2, 128], [last_len, 1, 2, 128]]
    ]
    model.config(
        target_platform=config.target_platform,
        dynamic_input=dynamic_input,
        optimization_level=0
    )

    print('load ...')
    # Load ONNX model
    if yaml_config["outputs_nodes"] is None:
        ret = model.load_onnx(model=yaml_config["model_path"])
    else:
        ret = model.load_onnx(
            model=yaml_config["model_path"],
            outputs=yaml_config["outputs_nodes"])
    assert ret == 0, "Load model failed!"

    print('build...')
    # Build model
    ret = model.build(
        do_quantization=yaml_config["do_quantization"])
    assert ret == 0, "Build model failed!"

    # Init Runtime
    ret = model.init_runtime()
    assert ret == 0, "Init runtime environment failed!"

    # Export
    if not os.path.exists(yaml_config["output_folder"]):
        os.mkdir(yaml_config["output_folder"])

    # Model file name without its extension, e.g. "model.onnx" -> "model"
    name_list = os.path.basename(yaml_config["model_path"]).split(".")
    model_base_name = ""
    for name in name_list[0:-1]:
        model_base_name += name
    model_device_name = config.target_platform.lower()
    if yaml_config["do_quantization"]:
        model_save_name = model_base_name + "_" + model_device_name + "_quantized" + ".rknn"
    else:
        model_save_name = model_base_name + "_" + model_device_name + "_unquantized" + ".rknn"
    ret = model.export_rknn(
        os.path.join(yaml_config["output_folder"], model_save_name))
    assert ret == 0, "Export rknn model failed!"
    print("Export OK!")

XiaBing992 avatar Jan 10 '25 03:01 XiaBing992

  1. Encoder models: I have converted ViT and BERT-like encoder models with rknn-toolkit2 2.3.0, so encoder conversion should be feasible. If yours is a decoder, try RKLLM instead.
  2. Check whether the inputs are correct: the error here is cannot reshape from [20, 1, 32, 32, 2] to [10, 1, 32, 64]. Check whether an input shape is wrong, because these two shapes really do hold different element counts. You can run the model once with onnxruntime and confirm the input shapes that give correct inference; see the sketch after this list.
  3. Static model: you are converting a dynamic-shape model here; try whether a static/fixed-shape model converts.
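A minimal onnxruntime check along the lines of point 2, assuming the input order matches the dynamic_input list in the conversion script and that float32 random data is acceptable for every input (read the real names and dtypes from the session itself); the model path is a placeholder:

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("transformer_layer.onnx")  # placeholder path

# Shapes copied from dynamic_input in the conversion script above.
shapes = [[20, 1, 4096], [1, 1, 20, 64], [20, 1, 32, 2], [44, 1, 2, 128], [44, 1, 2, 128]]

feed = {}
for inp, shape in zip(sess.get_inputs(), shapes):
    print(inp.name, inp.shape, inp.type)  # what the model itself declares
    feed[inp.name] = np.random.rand(*shape).astype(np.float32)

for out in sess.run(None, feed):
    print(out.shape)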

hebangwen avatar Jan 16 '25 04:01 hebangwen

  1. Encoder models: I have converted ViT and BERT-like encoder models with rknn-toolkit2 2.3.0, so encoder conversion should be feasible. If yours is a decoder, try RKLLM instead.
  2. Check whether the inputs are correct: the error here is cannot reshape from [20, 1, 32, 32, 2] to [10, 1, 32, 64]. Check whether an input shape is wrong, because these two shapes really do hold different element counts. You can run the model once with onnxruntime and confirm the input shapes that give correct inference.
  3. Static model: you are converting a dynamic-shape model here; try whether a static/fixed-shape model converts.

  1. Does rkllm not support loading ONNX directly?
  2. I have tried inference with ONNX and it works, so I don't know why rknn throws this error.
  3. A static/fixed shape doesn't work either.

XiaBing992 avatar Jan 16 '25 06:01 XiaBing992

rkllm does not support ONNX; it takes the model folder written by the transformers library's save_pretrained as input.
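A minimal sketch of producing that input, assuming a Hugging Face causal LM; the model name and output directory below are placeholders:

from transformers import AutoModelForCausalLM, AutoTokenizer

name = "some-org/some-decoder"  # placeholder, not a real checkpoint
model = AutoModelForCausalLM.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

# RKLLM consumes this directory (config.json, weights, tokenizer files),
# not an exported .onnx file.
model.save_pretrained("./model_dir")
tokenizer.save_pretrained("./model_dir")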

rknn-toolkit2/issues/226 reports the same problem. My feeling is that the cause may be that RKNN cannot broadcast internally. Judging from the error message, the Gather, Split, and Reshape nodes are the likely suspects; one way to narrow it down before sending the model is sketched below.
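A sketch (plain ONNX tooling, not part of rknn-toolkit2) that runs shape inference and lists those node types with the shapes ONNX can statically prove; the model path is a placeholder:

import onnx
from onnx import shape_inference

model = onnx.load("transformer_layer.onnx")  # placeholder path
inferred = shape_inference.infer_shapes(model)

# Collect the inferred shape of every named tensor in the graph.
value_shapes = {}
for vi in list(inferred.graph.value_info) + list(inferred.graph.input) + list(inferred.graph.output):
    dims = [d.dim_value if d.HasField("dim_value") else d.dim_param
            for d in vi.type.tensor_type.shape.dim]
    value_shapes[vi.name] = dims

# Print every Gather/Split/Reshape with its inferred output shapes.
for node in inferred.graph.node:
    if node.op_type in {"Gather", "Split", "Reshape"}:
        outs = {o: value_shapes.get(o) for o in node.output}
        print(node.op_type, node.name, outs)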

If you can, send the model to my email and I'll take a look: aGViYW5nd2VuQG91dGxvb2suY29t (decode with base64).

PS: If you are an enterprise customer, see whether you can register a redmine account; it is Rockchip's internal issue platform, where you can get feedback from their engineers.

hebangwen avatar Jan 16 '25 07:01 hebangwen