[bugfix] Fix oneflow backend getting stuck

Open crazy-JiangDongHua opened this issue 1 year ago • 4 comments

Fixed an issue where running resnet50 through torch compile with the oneflow backend would hang.

crazy-JiangDongHua avatar Feb 09 '24 05:02 crazy-JiangDongHua

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Feb 09 '24 05:02 CLAassistant

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions[bot] avatar Feb 21 '24 04:02 github-actions[bot]

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10435/

github-actions[bot] avatar Feb 21 '24 06:02 github-actions[bot]

Speed stats:
GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.7ms (= 4369.3ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 57.5ms (= 5751.0ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.32 (= 57.5ms / 43.7ms)

OneFlow resnet50 time: 26.6ms (= 2657.5ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 37.3ms (= 3734.5ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.41 (= 37.3ms / 26.6ms)

OneFlow resnet50 time: 20.0ms (= 3996.6ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 34.8ms (= 6959.6ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.74 (= 34.8ms / 20.0ms)

OneFlow resnet50 time: 17.4ms (= 3477.8ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 31.1ms (= 6222.0ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.79 (= 31.1ms / 17.4ms)

OneFlow resnet50 time: 17.5ms (= 3495.4ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.4ms (= 5877.6ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.68 (= 29.4ms / 17.5ms)

OneFlow swin dataloader time: 0.200s (= 39.940s / 200, num_workers=1)
PyTorch swin dataloader time: 0.129s (= 25.731s / 200, num_workers=1)
Relative speed: 0.644 (= 0.129s / 0.200s)

OneFlow swin dataloader time: 0.055s (= 10.904s / 200, num_workers=4)
PyTorch swin dataloader time: 0.033s (= 6.523s / 200, num_workers=4)
Relative speed: 0.598 (= 0.033s / 0.055s)

OneFlow swin dataloader time: 0.030s (= 5.942s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.348s / 200, num_workers=8)
Relative speed: 0.563 (= 0.017s / 0.030s)

❌ OneFlow resnet50 time: 49.2ms (= 4917.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 65.6ms (= 6561.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.33 (= 65.6ms / 49.2ms)

OneFlow resnet50 time: 35.8ms (= 3585.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 46.1ms (= 4612.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 46.1ms / 35.8ms)

OneFlow resnet50 time: 28.0ms (= 5607.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 40.6ms (= 8117.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.45 (= 40.6ms / 28.0ms)

OneFlow resnet50 time: 25.0ms (= 4990.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 38.4ms (= 7686.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.54 (= 38.4ms / 25.0ms)

OneFlow resnet50 time: 24.0ms (= 4805.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 37.0ms (= 7395.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.54 (= 37.0ms / 24.0ms)

github-actions[bot] avatar Feb 21 '24 06:02 github-actions[bot]
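
For readers checking the speed stats above, here is a minimal sketch of how the per-iteration time and the relative speed appear to be derived from the logged totals. The helper names `per_iter_ms` and `relative_speed` are hypothetical and are not taken from the OneFlow CI scripts; the sketch also assumes the ratio is PyTorch time divided by OneFlow time, which matches the `(= 57.5ms / 43.7ms)` annotation in the log.

```python
def per_iter_ms(total_ms: float, iters: int) -> float:
    """Average time per iteration, given the total time and iteration count."""
    return total_ms / iters


def relative_speed(pytorch_time: float, oneflow_time: float) -> float:
    """PyTorch time over OneFlow time; a value above 1 means OneFlow was faster."""
    return pytorch_time / oneflow_time


# Totals copied from the first resnet50 entry above (input_shape=[16, 3, 224, 224]).
oneflow_total_ms = 4369.3
pytorch_total_ms = 5751.0
iters = 100

print(f"OneFlow resnet50 time: {per_iter_ms(oneflow_total_ms, iters):.1f}ms")
print(f"PyTorch resnet50 time: {per_iter_ms(pytorch_total_ms, iters):.1f}ms")
print(f"Relative speed: {relative_speed(pytorch_total_ms, oneflow_total_ms):.2f}")
```

Read this way, the resnet50 entries (ratios above 1) show OneFlow finishing each iteration sooner than PyTorch, while the swin dataloader entries (ratios below 1) show the OneFlow dataloader taking longer.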

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions[bot] avatar Feb 28 '24 19:02 github-actions[bot]

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions[bot] avatar Feb 29 '24 02:02 github-actions[bot]

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10435/

github-actions[bot] avatar Feb 29 '24 03:02 github-actions[bot]

Speed stats:
GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.9ms (= 4388.3ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 57.0ms (= 5700.3ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.30 (= 57.0ms / 43.9ms)

OneFlow resnet50 time: 26.5ms (= 2650.2ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 38.9ms (= 3892.8ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.47 (= 38.9ms / 26.5ms)

OneFlow resnet50 time: 18.3ms (= 3656.8ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 34.5ms (= 6892.0ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.88 (= 34.5ms / 18.3ms)

OneFlow resnet50 time: 17.6ms (= 3522.9ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 29.5ms (= 5903.5ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.68 (= 29.5ms / 17.6ms)

OneFlow resnet50 time: 16.1ms (= 3226.2ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 31.4ms (= 6283.2ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.95 (= 31.4ms / 16.1ms)

OneFlow swin dataloader time: 0.200s (= 39.987s / 200, num_workers=1)
PyTorch swin dataloader time: 0.128s (= 25.508s / 200, num_workers=1)
Relative speed: 0.638 (= 0.128s / 0.200s)

OneFlow swin dataloader time: 0.054s (= 10.831s / 200, num_workers=4)
PyTorch swin dataloader time: 0.032s (= 6.395s / 200, num_workers=4)
Relative speed: 0.590 (= 0.032s / 0.054s)

OneFlow swin dataloader time: 0.030s (= 6.062s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.382s / 200, num_workers=8)
Relative speed: 0.558 (= 0.017s / 0.030s)

❌ OneFlow resnet50 time: 49.2ms (= 4918.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 64.8ms (= 6477.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 64.8ms / 49.2ms)

OneFlow resnet50 time: 36.2ms (= 3624.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 44.9ms (= 4492.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.24 (= 44.9ms / 36.2ms)

OneFlow resnet50 time: 28.5ms (= 5691.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 39.7ms (= 7940.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.40 (= 39.7ms / 28.5ms)

OneFlow resnet50 time: 25.0ms (= 4995.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 39.1ms (= 7815.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.56 (= 39.1ms / 25.0ms)

OneFlow resnet50 time: 24.0ms (= 4791.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 36.0ms (= 7200.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.50 (= 36.0ms / 24.0ms)

github-actions[bot] avatar Feb 29 '24 03:02 github-actions[bot]

View latest API docs preview at: https://oneflow-staging.oss-cn-beijing.aliyuncs.com/docs/Oneflow-Inc/oneflow/pr/10435/

github-actions[bot] avatar Feb 29 '24 06:02 github-actions[bot]

Speed stats:
GPU Name: NVIDIA GeForce RTX 3080 Ti 

❌ OneFlow resnet50 time: 43.7ms (= 4367.7ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 58.3ms (= 5827.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.33 (= 58.3ms / 43.7ms)

OneFlow resnet50 time: 26.2ms (= 2621.1ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 37.5ms (= 3752.2ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.43 (= 37.5ms / 26.2ms)

OneFlow resnet50 time: 18.3ms (= 3666.1ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 36.1ms (= 7218.9ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.97 (= 36.1ms / 18.3ms)

OneFlow resnet50 time: 18.4ms (= 3683.4ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 31.2ms (= 6243.1ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.69 (= 31.2ms / 18.4ms)

OneFlow resnet50 time: 16.5ms (= 3308.9ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 29.5ms (= 5902.8ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.78 (= 29.5ms / 16.5ms)

OneFlow swin dataloader time: 0.199s (= 39.873s / 200, num_workers=1)
PyTorch swin dataloader time: 0.128s (= 25.656s / 200, num_workers=1)
Relative speed: 0.643 (= 0.128s / 0.199s)

OneFlow swin dataloader time: 0.056s (= 11.135s / 200, num_workers=4)
PyTorch swin dataloader time: 0.033s (= 6.636s / 200, num_workers=4)
Relative speed: 0.596 (= 0.033s / 0.056s)

OneFlow swin dataloader time: 0.033s (= 6.669s / 200, num_workers=8)
PyTorch swin dataloader time: 0.017s (= 3.356s / 200, num_workers=8)
Relative speed: 0.503 (= 0.017s / 0.033s)

❌ OneFlow resnet50 time: 49.0ms (= 4901.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.2ms (= 6618.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 66.2ms / 49.0ms)

OneFlow resnet50 time: 36.6ms (= 3658.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 45.2ms (= 4517.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.23 (= 45.2ms / 36.6ms)

OneFlow resnet50 time: 28.1ms (= 5626.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 40.1ms (= 8021.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.43 (= 40.1ms / 28.1ms)

OneFlow resnet50 time: 24.8ms (= 4959.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 39.5ms (= 7900.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.59 (= 39.5ms / 24.8ms)

OneFlow resnet50 time: 24.1ms (= 4819.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 37.3ms (= 7454.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.55 (= 37.3ms / 24.1ms)

github-actions[bot] avatar Feb 29 '24 07:02 github-actions[bot]