oneflow
add attr first_iter_when_persistent_workers
- [x] Fix the issue described in https://github.com/Oneflow-Inc/OneTeam/issues/1674
Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.
View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9051/
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 129.2ms (= 12924.6ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.0ms (= 14302.5ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.11 (= 143.0ms / 129.2ms)
OneFlow resnet50 time: 74.4ms (= 7438.7ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.9ms (= 8493.9ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.14 (= 84.9ms / 74.4ms)
OneFlow resnet50 time: 46.7ms (= 9340.6ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 60.0ms (= 11991.3ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.28 (= 60.0ms / 46.7ms)
OneFlow resnet50 time: 34.2ms (= 6833.6ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 41.3ms (= 8262.5ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.21 (= 41.3ms / 34.2ms)
OneFlow resnet50 time: 28.2ms (= 5632.2ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 34.9ms (= 6977.5ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.24 (= 34.9ms / 28.2ms)
OneFlow swin dataloader time: 0.259s (= 51.839s / 200, num_workers=1)
PyTorch swin dataloader time: 0.151s (= 30.250s / 200, num_workers=1)
Relative speed: 0.584 (= 0.151s / 0.259s)
OneFlow swin dataloader time: 0.111s (= 22.214s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.203s / 200, num_workers=4)
Relative speed: 0.369 (= 0.041s / 0.111s)
OneFlow swin dataloader time: 0.040s (= 7.930s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.552s / 200, num_workers=8)
Relative speed: 0.574 (= 0.023s / 0.040s)
❌ OneFlow resnet50 time: 138.1ms (= 13812.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.0ms (= 16201.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 162.0ms / 138.1ms)
OneFlow resnet50 time: 84.7ms (= 8466.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.4ms (= 10336.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.22 (= 103.4ms / 84.7ms)
OneFlow resnet50 time: 57.2ms (= 11436.6ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.4ms (= 15684.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.37 (= 78.4ms / 57.2ms)
OneFlow resnet50 time: 44.2ms (= 8833.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.8ms (= 13963.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.58 (= 69.8ms / 44.2ms)
OneFlow resnet50 time: 38.5ms (= 7702.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.8ms (= 13554.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.76 (= 67.8ms / 38.5ms)
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 129.6ms (= 12964.7ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 148.4ms (= 14843.9ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.14 (= 148.4ms / 129.6ms)
OneFlow resnet50 time: 74.9ms (= 7486.9ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.2ms (= 8418.4ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.12 (= 84.2ms / 74.9ms)
OneFlow resnet50 time: 47.4ms (= 9470.6ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 57.4ms (= 11488.4ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.21 (= 57.4ms / 47.4ms)
OneFlow resnet50 time: 34.9ms (= 6977.7ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 46.3ms (= 9254.2ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.33 (= 46.3ms / 34.9ms)
OneFlow resnet50 time: 30.0ms (= 5992.5ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 42.3ms (= 8459.3ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.41 (= 42.3ms / 30.0ms)
OneFlow swin dataloader time: 0.269s (= 53.771s / 200, num_workers=1)
PyTorch swin dataloader time: 0.150s (= 30.003s / 200, num_workers=1)
Relative speed: 0.558 (= 0.150s / 0.269s)
OneFlow swin dataloader time: 0.071s (= 14.184s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.317s / 200, num_workers=4)
Relative speed: 0.586 (= 0.042s / 0.071s)
OneFlow swin dataloader time: 0.043s (= 8.678s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.456s / 200, num_workers=8)
Relative speed: 0.514 (= 0.022s / 0.043s)
❌ OneFlow resnet50 time: 140.1ms (= 14006.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 164.0ms (= 16395.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 164.0ms / 140.1ms)
OneFlow resnet50 time: 86.7ms (= 8673.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.4ms (= 10244.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.18 (= 102.4ms / 86.7ms)
OneFlow resnet50 time: 58.7ms (= 11747.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.5ms (= 15700.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 78.5ms / 58.7ms)
OneFlow resnet50 time: 45.0ms (= 9007.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 80.5ms (= 16100.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.79 (= 80.5ms / 45.0ms)
OneFlow resnet50 time: 40.2ms (= 8044.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.5ms (= 13709.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.70 (= 68.5ms / 40.2ms)
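For anyone reading the numbers above: the per-iteration times and relative speeds appear to be derived from the reported totals, with relative speed being PyTorch time divided by OneFlow time (so values above 1 mean OneFlow is faster). The sketch below reproduces that arithmetic. The function names are hypothetical and are not taken from the CI script; the assumption that the ratio is computed from the unrounded totals is inferred from the figures, not confirmed by the source.

```python
# Hypothetical helpers (not from the OneFlow CI script) that reproduce the
# arithmetic shown in the speed stats.

def per_iteration(total_time: float, iterations: int) -> float:
    """Average time per iteration, e.g. 12924.6 ms / 100 iterations."""
    return total_time / iterations


def relative_speed(pytorch_total: float, oneflow_total: float, iterations: int) -> float:
    """PyTorch per-iteration time divided by OneFlow per-iteration time.

    Assumption: the ratio uses the unrounded totals, which is why the log can
    show 0.584 even though the rounded values 0.151 / 0.259 give about 0.583.
    """
    return per_iteration(pytorch_total, iterations) / per_iteration(oneflow_total, iterations)


# resnet50, input_shape=[16, 3, 224, 224], first run
resnet_ratio = relative_speed(14302.5, 12924.6, 100)
print(f"{per_iteration(12924.6, 100):.1f}ms", f"{resnet_ratio:.2f}")

# swin dataloader, num_workers=1, first run
dataloader_ratio = relative_speed(30.250, 51.839, 200)
print(f"{dataloader_ratio:.3f}")
```

Read this way, the dataloader ratios below 1 in both runs indicate that the OneFlow dataloader was slower than PyTorch's in this benchmark, while the resnet50 ratios are all above 1.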
This PR is no longer needed after https://github.com/Oneflow-Inc/oneflow/pull/9246.