Dynamic Batch Model Testing and Debugging
The list of models and their current support status can be seen here (AMD internal only): https://amdcloud-my.sharepoint.com/:x:/g/personal/charllin_amd_com/Ebf6_4jYgANDnx8_tUr3Y_4Bs4ULKjYLsNmQZWDiQOrO4w?e=v6OEYY&nav=MTVfezAwMDAwMDAwLTAwMDEtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMH0
Try the given "driver" commands with each model and see if they can compile without failing. This feature is a work in progress so we can expect to find fails due either to bugs or to incomplete implementation of dynamic batch sizing for specific ops.
Log the output from each driver run, with error messages.
Debug and fix the errors we find.
Be prepared to support QA by providing them with archive locations of the sample model files, as well as command-line arguments to run each one when they construct tests.
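One way to batch these runs and keep a per-model log is sketched below. This is only a sketch, not part of the repo: the driver path (bin/driver), model paths, and the MODELS entries are placeholders copied from the command list further down, and the driver flags used are exactly the ones shown there.

```python
#!/usr/bin/env python3
"""Sketch: run the MIGraphX driver on each model with dynamic-batch flags and save a log per model."""
import os
import subprocess

# (model file, input parameter name, dynamic input dims) -- placeholder entries taken from the list below
MODELS = [
    ("../models/resnet50_v1.onnx", "@data", "[{min:1, max:4}, 3, 224, 224]"),
    ("../models/distilgpt2_1.onnx", "@input_ids", "[{min:1, max:4}, 128]"),
]

LOG_DIR = "../models/logs"
os.makedirs(LOG_DIR, exist_ok=True)

for model, input_name, dyn_dims in MODELS:
    log_path = os.path.join(LOG_DIR, os.path.basename(model) + ".errlog")
    cmd = [
        "bin/driver", "verify", model,
        "--split-single-dyn-dim",
        "--batch", "3",
        "--dyn-input-dim", input_name, dyn_dims,
    ]
    env = dict(os.environ, MIGRAPHX_TRACE_EVAL="1")  # same trace variable used in the commands below
    with open(log_path, "w") as log:
        # capture stdout and stderr together, as the ">&" redirections below do
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, env=env)
    print(f"{model}: {'ok' if result.returncode == 0 else 'FAILED'} (log: {log_path})")
```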
Need to add dynamic shape support to the deconvolution op for the model 3dunet_kits19_128x128x128.onnx.
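For background, dynamic shape support for an op roughly means computing the output's {min, max} bounds from the input's {min, max} bounds instead of from fixed lengths. The sketch below is an illustration only, not MIGraphX code: the dict-based dimension is a stand-in for a dynamic dimension, and the formula is the standard ONNX/PyTorch ConvTranspose output-length formula.

```python
def conv_transpose_out_len(in_len, kernel, stride=1, pad_begin=0, pad_end=0,
                           dilation=1, output_padding=0):
    """Standard ConvTranspose output length for one spatial axis."""
    return ((in_len - 1) * stride - (pad_begin + pad_end)
            + dilation * (kernel - 1) + output_padding + 1)

def dyn_out_range(dyn_in, **kwargs):
    """Propagate a {min, max} dynamic dimension through the output-length formula."""
    return {"min": conv_transpose_out_len(dyn_in["min"], **kwargs),
            "max": conv_transpose_out_len(dyn_in["max"], **kwargs)}

# e.g. a dynamic axis {min:1, max:4} through a 2x2, stride-2 deconvolution
print(dyn_out_range({"min": 1, "max": 4}, kernel=2, stride=2))  # {'min': 2, 'max': 8}
```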
The following model files can be found, for now, at /home/bpickrel/AMDMIGraphX/models/ on rocm-rome-6. I tested them with the given command lines:
- 3dunet_kits19_128x128x128.onnx (failed, see comment above)
  MIGRAPHX_TRACE_EVAL=yes bin/driver verify ../models/3dunet/model/3dunet_kits19_128x128x128.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim "@input" "[{min:1, max:4}, 1, 128, 128, 128]" >& ../models/logs/3dunet_kits19_128x128x128.errlog
- bert_base_cased_1_fp16_gpu.onnx (failed)
  bin/driver verify ../models/bert_base_cased_1_fp16_gpu.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim "@input_ids" "[{min:1, max:4}, 3]"
- resnet50-v1-7.onnx (success)
  bin/driver verify ../models/resnet50_v1.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim @data "[{min:1, max:4}, 3, 224, 224]"
- distilgpt2_1_fp16_gpu.onnx (failed)
  MIGRAPHX_TRACE_EVAL=1 bin/driver compile ../models/distilgpt2_1.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim @input_ids "[{min:1, max:4}, 128]" >& ../models/logs/distilgpt2_1.errlog
- distilgpt2_1.onnx (failed)
  MIGRAPHX_TRACE_EVAL=1 bin/driver verify ../models/distilgpt2_1.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim "@input_ids" "[{min:1, max:4}, 128]" >& ../models/logs/distilgpt2_1.errlog
- yolov4.onnx (failed)
  MIGRAPHX_TRACE_COMPILE=1 bin/driver compile ../models/yolov4.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim "@input_1:0" "[{min:1, max:4}, 416, 416, 3]" >& ../models/logs/yolov4.errlog
- inception_v2/model.onnx (success; have not located inception_v3 model yet)
  bin/driver verify ../models/inception_v2/model.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim "@data_0" "[{min:1, max:4}, 3, 224, 224]"
Todo: make auto padding work with pooling for the ONNX taau-downsample model taau_low_res_downsample_d2s_for_infer_time_fp16_opset11.onnx. With a dynamic shape, the amount of padding must be determined at runtime.
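For reference, under the ONNX auto_pad=SAME_UPPER/SAME_LOWER rule the total padding is a function of the input length itself, which is why it cannot be folded into a constant when that length is dynamic. A minimal sketch of the calculation (plain Python, not MIGraphX code; the function name is made up):

```python
import math

def same_auto_pad(in_len, kernel, stride, dilation=1, upper=True):
    """Total padding for ONNX auto_pad=SAME_* on one spatial axis.

    The output length is ceil(in_len / stride); the padding needed to reach
    that output length depends on in_len, so for a dynamic dimension it can
    only be computed once the actual input shape is known at runtime.
    """
    out_len = math.ceil(in_len / stride)
    effective_kernel = dilation * (kernel - 1) + 1
    total = max((out_len - 1) * stride + effective_kernel - in_len, 0)
    # SAME_UPPER puts the extra element (when total is odd) at the end,
    # SAME_LOWER puts it at the beginning.
    begin = total // 2 if upper else total - total // 2
    return begin, total - begin

# The same pooling kernel needs different padding for different input sizes:
for n in (7, 8, 9):
    print(n, same_auto_pad(n, kernel=3, stride=2))  # (1, 1), (0, 1), (1, 1)
```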
Note: use the Excel sheet at the link above for the correct command lines; the command lines listed here are now obsolete.