Add generic_ctor to pass dtype for typed tensors
Closes https://github.com/Oneflow-Inc/oneflow/issues/9265
View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9277/
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 139.3ms (= 13934.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.5ms (= 16250.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 162.5ms / 139.3ms)
OneFlow resnet50 time: 85.0ms (= 8501.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 112.1ms (= 11213.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 112.1ms / 85.0ms)
OneFlow resnet50 time: 57.3ms (= 11465.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.5ms (= 15490.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 77.5ms / 57.3ms)
OneFlow resnet50 time: 43.8ms (= 8755.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.1ms (= 14218.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.62 (= 71.1ms / 43.8ms)
OneFlow resnet50 time: 40.4ms (= 8072.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.5ms (= 13296.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.65 (= 66.5ms / 40.4ms)
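Each report line above follows the same arithmetic: the per-iteration time is the total wall time divided by the iteration count, and the relative speed is the PyTorch per-iteration time divided by the OneFlow one, so a value above 1 means OneFlow was faster. A minimal Python sketch of that calculation, using the batch-size-16 numbers from this report; the function names are hypothetical and are not taken from the CI tooling:

```python
def per_iter_ms(total_ms: float, iters: int) -> float:
    """Average time per iteration in milliseconds."""
    return total_ms / iters


def relative_speed(pytorch_ms: float, oneflow_ms: float) -> float:
    """Ratio of PyTorch time to OneFlow time; > 1 means OneFlow is faster."""
    return pytorch_ms / oneflow_ms


# Totals copied from the input_shape=[16, 3, 224, 224] lines above.
oneflow_ms = per_iter_ms(13934.2, 100)
pytorch_ms = per_iter_ms(16250.8, 100)

print(f"OneFlow: {oneflow_ms:.1f}ms, PyTorch: {pytorch_ms:.1f}ms, "
      f"relative speed: {relative_speed(pytorch_ms, oneflow_ms):.2f}")
# prints "OneFlow: 139.3ms, PyTorch: 162.5ms, relative speed: 1.17"
```

The same two functions reproduce the other rows, including the dataloader comparisons where the ratio falls below 1.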
CI failed when running job: cuda-misc. PR label automerge has been removed
Speed stats:
GPU Name: NVIDIA GeForce GTX 1080
❌ OneFlow resnet50 time: 148.0ms (= 14803.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 170.6ms (= 17059.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 170.6ms / 148.0ms)
OneFlow resnet50 time: 96.1ms (= 9605.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 113.3ms (= 11329.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.18 (= 113.3ms / 96.1ms)
OneFlow resnet50 time: 68.1ms (= 13629.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 89.4ms (= 17880.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.31 (= 89.4ms / 68.1ms)
OneFlow resnet50 time: 59.0ms (= 11799.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.1ms (= 15221.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.29 (= 76.1ms / 59.0ms)
OneFlow resnet50 time: 54.9ms (= 10972.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.2ms (= 14233.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 71.2ms / 54.9ms)
CI failed when running job: cuda-speed-test. PR label automerge has been removed
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 139.5ms (= 13946.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.6ms (= 16161.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 161.6ms / 139.5ms)
OneFlow resnet50 time: 85.0ms (= 8500.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.2ms (= 10119.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.19 (= 101.2ms / 85.0ms)
OneFlow resnet50 time: 57.6ms (= 11521.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.0ms (= 15605.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 78.0ms / 57.6ms)
OneFlow resnet50 time: 44.3ms (= 8852.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.5ms (= 13904.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.57 (= 69.5ms / 44.3ms)
OneFlow resnet50 time: 38.7ms (= 7734.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.8ms (= 13751.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.78 (= 68.8ms / 38.7ms)
Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 139.4ms (= 13935.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.3ms (= 16127.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 161.3ms / 139.4ms)
OneFlow resnet50 time: 85.1ms (= 8514.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.7ms (= 10167.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.19 (= 101.7ms / 85.1ms)
OneFlow resnet50 time: 57.7ms (= 11536.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.2ms (= 15432.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 77.2ms / 57.7ms)
OneFlow resnet50 time: 44.6ms (= 8916.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 73.8ms (= 14764.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.66 (= 73.8ms / 44.6ms)
OneFlow resnet50 time: 40.3ms (= 8065.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.5ms (= 14302.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.77 (= 71.5ms / 40.3ms)
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 139.9ms (= 13990.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 166.2ms (= 16619.3ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.19 (= 166.2ms / 139.9ms)
OneFlow resnet50 time: 85.7ms (= 8568.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.5ms (= 10247.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 102.5ms / 85.7ms)
OneFlow resnet50 time: 57.9ms (= 11578.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 86.5ms (= 17293.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.49 (= 86.5ms / 57.9ms)
OneFlow resnet50 time: 44.7ms (= 8936.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.4ms (= 14089.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.58 (= 70.4ms / 44.7ms)
OneFlow resnet50 time: 39.6ms (= 7923.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.1ms (= 14227.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.80 (= 71.1ms / 39.6ms)
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 144.7ms (= 14465.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 165.4ms (= 16541.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.14 (= 165.4ms / 144.7ms)
OneFlow resnet50 time: 85.7ms (= 8570.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.7ms (= 10173.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.19 (= 101.7ms / 85.7ms)
OneFlow resnet50 time: 58.2ms (= 11645.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.7ms (= 15133.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 75.7ms / 58.2ms)
OneFlow resnet50 time: 46.3ms (= 9255.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.5ms (= 14091.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.52 (= 70.5ms / 46.3ms)
OneFlow resnet50 time: 39.8ms (= 7967.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.5ms (= 13709.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.72 (= 68.5ms / 39.8ms)
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 139.4ms (= 13936.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.7ms (= 16173.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 161.7ms / 139.4ms)
OneFlow resnet50 time: 85.1ms (= 8507.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 111.3ms (= 11125.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.31 (= 111.3ms / 85.1ms)
OneFlow resnet50 time: 57.6ms (= 11529.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.5ms (= 15500.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 77.5ms / 57.6ms)
OneFlow resnet50 time: 44.4ms (= 8876.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.3ms (= 15862.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.79 (= 79.3ms / 44.4ms)
OneFlow resnet50 time: 40.0ms (= 8007.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.5ms (= 13309.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.66 (= 66.5ms / 40.0ms)
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 141.8ms (= 14178.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 165.3ms (= 16526.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 165.3ms / 141.8ms)
OneFlow resnet50 time: 86.3ms (= 8628.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.4ms (= 10341.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 103.4ms / 86.3ms)
OneFlow resnet50 time: 58.1ms (= 11620.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.3ms (= 15868.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.37 (= 79.3ms / 58.1ms)
OneFlow resnet50 time: 45.3ms (= 9068.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.8ms (= 13954.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.54 (= 69.8ms / 45.3ms)
OneFlow resnet50 time: 39.7ms (= 7941.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.2ms (= 15648.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.97 (= 78.2ms / 39.7ms)
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 141.1ms (= 14109.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 163.7ms (= 16367.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 163.7ms / 141.1ms)
OneFlow resnet50 time: 85.5ms (= 8552.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.3ms (= 10127.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.18 (= 101.3ms / 85.5ms)
OneFlow resnet50 time: 57.8ms (= 11553.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.6ms (= 15722.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.36 (= 78.6ms / 57.8ms)
OneFlow resnet50 time: 44.3ms (= 8869.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 82.4ms (= 16483.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.86 (= 82.4ms / 44.3ms)
OneFlow resnet50 time: 41.1ms (= 8213.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.8ms (= 13957.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.70 (= 69.8ms / 41.1ms)
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 141.2ms (= 14123.5ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 142.4ms (= 14238.0ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.01 (= 142.4ms / 141.2ms)
OneFlow resnet50 time: 82.7ms (= 8265.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 88.0ms (= 8797.6ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.06 (= 88.0ms / 82.7ms)
OneFlow resnet50 time: 50.8ms (= 10163.3ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 62.1ms (= 12421.2ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.22 (= 62.1ms / 50.8ms)
OneFlow resnet50 time: 34.0ms (= 6792.1ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 42.9ms (= 8586.7ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.26 (= 42.9ms / 34.0ms)
OneFlow resnet50 time: 26.2ms (= 5231.7ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 43.1ms (= 8619.4ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.65 (= 43.1ms / 26.2ms)
OneFlow swin dataloader time: 0.238s (= 47.551s / 200, num_workers=1)
PyTorch swin dataloader time: 0.150s (= 30.036s / 200, num_workers=1)
Relative speed: 0.632 (= 0.150s / 0.238s)
OneFlow swin dataloader time: 0.067s (= 13.335s / 200, num_workers=4)
PyTorch swin dataloader time: 0.043s (= 8.571s / 200, num_workers=4)
Relative speed: 0.643 (= 0.043s / 0.067s)
OneFlow swin dataloader time: 0.046s (= 9.164s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.595s / 200, num_workers=8)
Relative speed: 0.501 (= 0.023s / 0.046s)
❌ OneFlow resnet50 time: 153.0ms (= 15300.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 164.7ms (= 16471.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.08 (= 164.7ms / 153.0ms)
OneFlow resnet50 time: 93.2ms (= 9324.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.2ms (= 10419.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.12 (= 104.2ms / 93.2ms)
OneFlow resnet50 time: 61.0ms (= 12195.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.0ms (= 15796.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 79.0ms / 61.0ms)
OneFlow resnet50 time: 42.9ms (= 8572.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.8ms (= 13955.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.63 (= 69.8ms / 42.9ms)
OneFlow resnet50 time: 35.9ms (= 7176.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.8ms (= 13755.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.92 (= 68.8ms / 35.9ms)
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 141.3ms (= 14134.5ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.9ms (= 14393.8ms / 100, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.02 (= 143.9ms / 141.3ms)
OneFlow resnet50 time: 82.5ms (= 8253.5ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 86.2ms (= 8616.7ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.04 (= 86.2ms / 82.5ms)
OneFlow resnet50 time: 51.4ms (= 10271.9ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 60.4ms (= 12087.1ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.18 (= 60.4ms / 51.4ms)
OneFlow resnet50 time: 33.7ms (= 6730.2ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 44.3ms (= 8863.1ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.32 (= 44.3ms / 33.7ms)
OneFlow resnet50 time: 27.0ms (= 5401.2ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 39.2ms (= 7830.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.45 (= 39.2ms / 27.0ms)
OneFlow swin dataloader time: 0.235s (= 47.096s / 200, num_workers=1)
PyTorch swin dataloader time: 0.152s (= 30.464s / 200, num_workers=1)
Relative speed: 0.647 (= 0.152s / 0.235s)
OneFlow swin dataloader time: 0.070s (= 13.913s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.372s / 200, num_workers=4)
Relative speed: 0.602 (= 0.042s / 0.070s)
OneFlow swin dataloader time: 0.039s (= 7.819s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.471s / 200, num_workers=8)
Relative speed: 0.572 (= 0.022s / 0.039s)
❌ OneFlow resnet50 time: 153.2ms (= 15323.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 165.0ms (= 16495.3ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 1.08 (= 165.0ms / 153.2ms)
OneFlow resnet50 time: 93.2ms (= 9317.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.4ms (= 10239.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.10 (= 102.4ms / 93.2ms)
OneFlow resnet50 time: 61.0ms (= 12197.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.4ms (= 15479.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.27 (= 77.4ms / 61.0ms)
OneFlow resnet50 time: 43.3ms (= 8663.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.3ms (= 14464.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.67 (= 72.3ms / 43.3ms)
OneFlow resnet50 time: 35.7ms (= 7142.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.5ms (= 13506.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.89 (= 67.5ms / 35.7ms)