
nsys report for the one-yolov5/classify/train.py script [2023-03-29]

Open ccssu opened this issue 1 year ago • 0 comments

  • Introduction
  • [one-yolo test results]
    • [one-yolov5 project data]
  • [one-yolo detailed test data]
  • [Proposed fix]
  • [Resources]

Introduction

I ran two nsys reports on one-yolov5/classify/train.py, one with one-yolo and one with torch-yolo.

one-yolo_profile: 03-29-07-10profile.zip

torch-yolo_profile: torch_03-29-08-37profile.zip

one-yolo test results

https://github.com/Oneflow-Inc/one-yolov5/blob/f1aaf236d05d46b5aea50bf4318edacbcd687b38/classify/train.py#L245

                              one-yolo    torch-yolo
Time spent on the tloss line  99 ms       14 ms

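Notes:

  • flow.__version__='0.9.1.dev20230327+cu117'
  • torch.__version__='1.13.0+cu117'
  • Both runs were trained in float32.
  • Both launch commands use batch-size=256, epochs=6, and the yolov5s-cls model.
  • Machine: A100

Conclusion: according to the nsys analysis, the tloss line is markedly slower in one-yolo than in torch-yolo (99 ms vs. 14 ms). Optimizing it should improve training speed substantially.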
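To make the measured line concrete: in upstream YOLOv5's classify/train.py, the tloss line is a running-mean update of the form `tloss = (tloss * i + loss.item()) / (i + 1)`. Assuming the one-yolov5 line linked above has the same shape (an assumption, not verified against that commit), the sketch below reproduces the arithmetic with plain floats. The names `update_mean_loss` and `mean_loss` are hypothetical. The arithmetic itself is cheap; `loss.item()` has to copy a scalar from the GPU to the host, which typically blocks until the queued GPU work has finished, so the time nsys attributes to this line may be mostly waiting rather than computing.

```python
def update_mean_loss(tloss: float, loss_value: float, i: int) -> float:
    """Return the mean over batches 0..i, given the mean over batches 0..i-1."""
    return (tloss * i + loss_value) / (i + 1)


def mean_loss(losses):
    """Apply the running-mean update batch by batch (plain floats, no GPU)."""
    tloss = 0.0
    for i, loss_value in enumerate(losses):
        tloss = update_mean_loss(tloss, loss_value, i)
    return tloss


# Ratio of the two per-line timings reported above (99 ms vs. 14 ms).
slowdown = 99 / 14
print(f"one-yolo / torch-yolo on the tloss line: {slowdown:.2f}x")
```

In the real training loop `loss_value` would come from `loss.item()`; the plain-float version here only shows what the line computes.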

one-yolov5 project data

Project URL: https://github.com/Oneflow-Inc/one-yolov5
Dataset path: @oneflow-25:/data/home/fengwen/imagenette160
Weights path: @oneflow-25:/data/home/fengwen/weight_v1_2_0

If running nsys fails with the following error:
The target application terminated. One or more process it created re-parented.
Waiting for termination of re-parented processes.
Use the `--wait` option to modify this behavior.

comment out the check_git_status() line in train.py.
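The edit can be made by hand; as a minimal sketch, the helper below does the same thing on the text of a source file. The function name `comment_out_call` is hypothetical, and it only handles lines that contain nothing but the call itself (no trailing comment, no assignment), which is assumed to be how `check_git_status()` appears in train.py.

```python
import re


def comment_out_call(source: str, call: str = "check_git_status()") -> str:
    """Comment out every line that consists solely of `call`, keeping indentation."""
    pattern = re.compile(r"^(\s*)(" + re.escape(call) + r")\s*$")
    lines = [pattern.sub(r"\1# \2", line) for line in source.splitlines()]
    trailing_newline = "\n" if source.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


example = (
    "if RANK in {-1, 0}:\n"
    "    check_git_status()\n"
    "    check_requirements()\n"
)
print(comment_out_call(example))
```

Already commented lines are left alone, so running the helper twice gives the same result as running it once.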

one-yolo detailed test data

one-yolov5 launch command
set -e  # stop at the first failing command, including the cd below
DATESTR=$(date +"%m-%d-%H-%M")  # timestamp used in the report file name
cd ~/one-yolov5
# py-spy record -o profile.svg --native --
# Profile the classification training script with nsys; the report is written under runs/.
run_cmd="/usr/local/cuda/bin/nsys profile -o runs/${DATESTR}profile python \
    classify/train.py \
    --model runs/yolov5s-cls.pt \
    --data ../datasets/imagenette160 \
    --img 224 \
    --batch 256 \
    --epochs 6 \
    --project One-YOLOv5_v_1_2_0_train \
    --name yolov5n-default \
    --multi_tensor_optimizer \
    --lr0 0.1 --optimizer SGD"

echo "${run_cmd}"
eval "${run_cmd}"
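The same launch can be expressed as an argument list instead of a string passed to `eval`, which avoids shell quoting issues. The sketch below only builds and prints the command; the function name `build_nsys_cmd` is hypothetical, and every flag and path is copied from the launch command above.

```python
import shlex
from datetime import datetime


def build_nsys_cmd(datestr: str, nsys: str = "/usr/local/cuda/bin/nsys") -> list:
    """Build the nsys profiling command as an argument list (nothing is executed)."""
    return [
        nsys, "profile", "-o", f"runs/{datestr}profile",
        "python", "classify/train.py",
        "--model", "runs/yolov5s-cls.pt",
        "--data", "../datasets/imagenette160",
        "--img", "224",
        "--batch", "256",
        "--epochs", "6",
        "--project", "One-YOLOv5_v_1_2_0_train",
        "--name", "yolov5n-default",
        "--multi_tensor_optimizer",
        "--lr0", "0.1",
        "--optimizer", "SGD",
    ]


datestr = datetime.now().strftime("%m-%d-%H-%M")  # same format as DATESTR above
cmd = build_nsys_cmd(datestr)
print(shlex.join(cmd))
```

To actually start the run, the list could be passed to `subprocess.run(cmd, check=True)` from inside the one-yolov5 checkout.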

one-yolo_profile: 03-29-07-10profile.zip

image

torch-yolo_profile: torch_03-29-08-37profile.zip

image

Proposed fix

Work in progress...

Resources

  • https://github.com/Oneflow-Inc/oneflow/pull/9394
  • /data/home/fengwen/package/oneflow/.idea/make_flow.sh

image

ccssu, Mar 29 '23 09:03