
GradAcc Mem V5: Part 0-4

Open chengtbf opened this issue 3 years ago • 6 comments

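  • [x] Part 0 : Logical Chain: use LogicalChainPass to perform the memory-reuse chain merge at the Job level
  • [x] Part 1 : AfterGradAccChain: merge the subgraph after GradAcc into a single logical chain
  • [x] Part 2 : Merge AfterGradAccChain with the first LogicalChain, and insert a special control op, AccCtrlTick, to enforce mutually exclusive access between the two
  • [x] Part 3 : Change the Repeat op to an inplace version
  • [x] Part 4 : Merge Acc into the LogicalChain that contains Backward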
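
As a rough illustration of why merging chains helps memory reuse (Parts 0 to 2), here is a toy Python model. It is not OneFlow code: `Chain`, `separate_footprint`, `merged_footprint`, and the byte sizes are all hypothetical. It only assumes that two chains which never execute at the same time can share one memory block.

```python
# Toy model only: not OneFlow code. Chain, separate_footprint and
# merged_footprint are hypothetical names, and the byte sizes are made up.
from dataclasses import dataclass


@dataclass
class Chain:
    name: str
    peak_bytes: int  # peak memory the chain needs while it executes


def separate_footprint(chains):
    """Each chain owns a private memory block, so the sizes add up."""
    return sum(c.peak_bytes for c in chains)


def merged_footprint(chains):
    """Chains that never execute at the same time can share one block,
    sized for the largest of them."""
    return max(c.peak_bytes for c in chains)


MIB = 1024 ** 2
chains = [
    Chain("first_logical_chain", 6 * MIB),
    Chain("after_grad_acc_chain", 2 * MIB),
]

print(separate_footprint(chains) // MIB)  # 8
print(merged_footprint(chains) // MIB)    # 6
```

Sharing a block is only safe when the two chains really are mutually exclusive, which is the role Part 2 gives to the AccCtrlTick control op.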

chengtbf avatar Aug 19 '22 04:08 chengtbf

Speed stats:

github-actions[bot] avatar Sep 13 '22 04:09 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8961/

github-actions[bot] avatar Sep 14 '22 08:09 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 130.3ms (= 13029.8ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 145.8ms (= 14578.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.12 (= 145.8ms / 130.3ms)

OneFlow resnet50 time: 78.1ms (= 7805.4ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 89.3ms (= 8933.3ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.14 (= 89.3ms / 78.1ms)

OneFlow resnet50 time: 49.5ms (= 9896.1ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 59.3ms (= 11853.0ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.20 (= 59.3ms / 49.5ms)

OneFlow resnet50 time: 36.1ms (= 7223.3ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 44.8ms (= 8959.0ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.24 (= 44.8ms / 36.1ms)

OneFlow resnet50 time: 30.9ms (= 6188.6ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 43.8ms (= 8761.5ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.42 (= 43.8ms / 30.9ms)

OneFlow swin dataloader time: 0.262s (= 52.477s / 200, num_workers=1)
PyTorch swin dataloader time: 0.152s (= 30.306s / 200, num_workers=1)
Relative speed: 0.578 (= 0.152s / 0.262s)

OneFlow swin dataloader time: 0.070s (= 13.958s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.118s / 200, num_workers=4)
Relative speed: 0.582 (= 0.041s / 0.070s)

OneFlow swin dataloader time: 0.039s (= 7.813s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.489s / 200, num_workers=8)
Relative speed: 0.575 (= 0.022s / 0.039s)

❌ OneFlow resnet50 time: 142.1ms (= 14214.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 166.7ms (= 16667.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 166.7ms / 142.1ms)

OneFlow resnet50 time: 88.6ms (= 8862.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 109.9ms (= 10991.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.24 (= 109.9ms / 88.6ms)

OneFlow resnet50 time: 60.0ms (= 11992.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.3ms (= 15850.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 79.3ms / 60.0ms)

OneFlow resnet50 time: 46.2ms (= 9249.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.5ms (= 13909.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.50 (= 69.5ms / 46.2ms)

OneFlow resnet50 time: 40.5ms (= 8098.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.1ms (= 15021.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.85 (= 75.1ms / 40.5ms)
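
For readers checking the numbers, the figures in the stats above can be reproduced with the arithmetic the log itself prints: per-iteration time is total time divided by iteration count, and relative speed is PyTorch time divided by OneFlow time, so values above 1 mean OneFlow was faster. The sketch below is not the CI bot's script (which is not shown in this thread); `per_iter` and `relative_speed` are hypothetical helper names, and the inputs are copied from the first resnet50 and swin dataloader entries.

```python
# Reproduces the arithmetic shown in the log; the helper names are
# hypothetical and this is not the CI bot's actual script.

def per_iter(total, iters):
    """Average time per iteration."""
    return total / iters


def relative_speed(pytorch_total, oneflow_total, iters):
    """PyTorch time over OneFlow time; > 1 means OneFlow is faster."""
    return per_iter(pytorch_total, iters) / per_iter(oneflow_total, iters)


# resnet50, input_shape=[16, 3, 224, 224], totals in ms over 100 iterations
print(f"{per_iter(13029.8, 100):.1f}")                 # 130.3
print(f"{per_iter(14578.6, 100):.1f}")                 # 145.8
print(f"{relative_speed(14578.6, 13029.8, 100):.2f}")  # 1.12

# swin dataloader, num_workers=1, totals in s over 200 iterations
print(f"{relative_speed(30.306, 52.477, 200):.3f}")    # 0.578
```

The dataloader ratio matches the log only when it is computed from the unrounded totals; dividing the rounded per-iteration values (0.152 / 0.262) gives about 0.580 instead of 0.578.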

github-actions[bot] avatar Sep 14 '22 08:09 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8961/

github-actions[bot] avatar Sep 14 '22 10:09 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 130.2ms (= 13018.5ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 145.3ms (= 14530.1ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.12 (= 145.3ms / 130.2ms)

OneFlow resnet50 time: 77.9ms (= 7787.9ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 88.1ms (= 8810.0ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.13 (= 88.1ms / 77.9ms)

OneFlow resnet50 time: 49.3ms (= 9861.6ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 59.0ms (= 11791.5ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.20 (= 59.0ms / 49.3ms)

OneFlow resnet50 time: 36.0ms (= 7199.7ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 44.5ms (= 8908.8ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.24 (= 44.5ms / 36.0ms)

OneFlow resnet50 time: 30.6ms (= 6126.6ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 41.3ms (= 8269.7ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.35 (= 41.3ms / 30.6ms)

OneFlow swin dataloader time: 0.267s (= 53.458s / 200, num_workers=1)
PyTorch swin dataloader time: 0.151s (= 30.258s / 200, num_workers=1)
Relative speed: 0.566 (= 0.151s / 0.267s)

OneFlow swin dataloader time: 0.067s (= 13.380s / 200, num_workers=4)
PyTorch swin dataloader time: 0.041s (= 8.216s / 200, num_workers=4)
Relative speed: 0.614 (= 0.041s / 0.067s)

OneFlow swin dataloader time: 0.038s (= 7.630s / 200, num_workers=8)
PyTorch swin dataloader time: 0.023s (= 4.538s / 200, num_workers=8)
Relative speed: 0.595 (= 0.023s / 0.038s)

❌ OneFlow resnet50 time: 141.7ms (= 14171.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 164.9ms (= 16490.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 164.9ms / 141.7ms)

OneFlow resnet50 time: 88.6ms (= 8857.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.7ms (= 10468.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.18 (= 104.7ms / 88.6ms)

OneFlow resnet50 time: 60.1ms (= 12016.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.3ms (= 15862.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.32 (= 79.3ms / 60.1ms)

OneFlow resnet50 time: 46.9ms (= 9379.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.5ms (= 13906.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.48 (= 69.5ms / 46.9ms)

OneFlow resnet50 time: 40.9ms (= 8181.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 74.5ms (= 14902.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.82 (= 74.5ms / 40.9ms)

github-actions[bot] avatar Sep 14 '22 17:09 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 130.0ms (= 13001.9ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.2ms (= 14317.1ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.10 (= 143.2ms / 130.0ms)

OneFlow resnet50 time: 76.8ms (= 7677.2ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 89.1ms (= 8911.8ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.16 (= 89.1ms / 76.8ms)

OneFlow resnet50 time: 49.1ms (= 9828.8ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 56.1ms (= 11228.5ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.14 (= 56.1ms / 49.1ms)

OneFlow resnet50 time: 36.5ms (= 7309.9ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 42.2ms (= 8437.7ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.15 (= 42.2ms / 36.5ms)

OneFlow resnet50 time: 31.7ms (= 6341.9ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 38.3ms (= 7662.6ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.21 (= 38.3ms / 31.7ms)

OneFlow swin dataloader time: 0.264s (= 52.883s / 200, num_workers=1)
PyTorch swin dataloader time: 0.148s (= 29.542s / 200, num_workers=1)
Relative speed: 0.559 (= 0.148s / 0.264s)

OneFlow swin dataloader time: 0.071s (= 14.212s / 200, num_workers=4)
PyTorch swin dataloader time: 0.040s (= 7.958s / 200, num_workers=4)
Relative speed: 0.560 (= 0.040s / 0.071s)

OneFlow swin dataloader time: 0.040s (= 7.942s / 200, num_workers=8)
PyTorch swin dataloader time: 0.021s (= 4.279s / 200, num_workers=8)
Relative speed: 0.539 (= 0.021s / 0.040s)

❌ OneFlow resnet50 time: 140.9ms (= 14085.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 164.4ms (= 16436.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 164.4ms / 140.9ms)

OneFlow resnet50 time: 88.0ms (= 8797.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.6ms (= 10360.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.18 (= 103.6ms / 88.0ms)

OneFlow resnet50 time: 60.4ms (= 12073.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.7ms (= 15735.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 78.7ms / 60.4ms)

OneFlow resnet50 time: 45.7ms (= 9144.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.0ms (= 15199.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.66 (= 76.0ms / 45.7ms)

OneFlow resnet50 time: 41.9ms (= 8373.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.2ms (= 15246.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.82 (= 76.2ms / 41.9ms)

github-actions[bot] avatar Sep 15 '22 13:09 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8961/

github-actions[bot] avatar Oct 09 '22 10:10 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 140.2ms (= 14020.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.8ms (= 16277.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 162.8ms / 140.2ms)

OneFlow resnet50 time: 85.9ms (= 8591.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.4ms (= 10436.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.21 (= 104.4ms / 85.9ms)

OneFlow resnet50 time: 59.2ms (= 11839.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.8ms (= 15765.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.33 (= 78.8ms / 59.2ms)

OneFlow resnet50 time: 45.2ms (= 9041.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.3ms (= 14255.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.58 (= 71.3ms / 45.2ms)

OneFlow resnet50 time: 40.3ms (= 8056.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.9ms (= 15380.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.91 (= 76.9ms / 40.3ms)

github-actions[bot] avatar Oct 09 '22 10:10 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8961/

github-actions[bot] avatar Oct 11 '22 12:10 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 139.8ms (= 13984.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 158.5ms (= 15849.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.13 (= 158.5ms / 139.8ms)

OneFlow resnet50 time: 85.1ms (= 8510.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 111.3ms (= 11125.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.31 (= 111.3ms / 85.1ms)

OneFlow resnet50 time: 57.7ms (= 11547.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.9ms (= 15779.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.37 (= 78.9ms / 57.7ms)

OneFlow resnet50 time: 44.6ms (= 8929.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 81.0ms (= 16202.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.81 (= 81.0ms / 44.6ms)

OneFlow resnet50 time: 40.3ms (= 8061.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.6ms (= 13721.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.70 (= 68.6ms / 40.3ms)

github-actions[bot] avatar Oct 11 '22 12:10 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8961/

github-actions[bot] avatar Oct 17 '22 16:10 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 139.1ms (= 13909.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.8ms (= 16082.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 160.8ms / 139.1ms)

OneFlow resnet50 time: 84.7ms (= 8472.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.6ms (= 10163.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 101.6ms / 84.7ms)

OneFlow resnet50 time: 57.6ms (= 11526.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.2ms (= 15630.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.36 (= 78.2ms / 57.6ms)

OneFlow resnet50 time: 45.0ms (= 9001.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.8ms (= 13768.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.53 (= 68.8ms / 45.0ms)

OneFlow resnet50 time: 41.1ms (= 8220.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.5ms (= 13305.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.62 (= 66.5ms / 41.1ms)

github-actions[bot] avatar Oct 17 '22 16:10 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 140.0ms (= 14002.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.6ms (= 16258.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 162.6ms / 140.0ms)

OneFlow resnet50 time: 84.9ms (= 8490.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.0ms (= 10398.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.22 (= 104.0ms / 84.9ms)

OneFlow resnet50 time: 57.5ms (= 11501.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 83.1ms (= 16613.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.44 (= 83.1ms / 57.5ms)

OneFlow resnet50 time: 44.4ms (= 8872.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.1ms (= 13818.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.56 (= 69.1ms / 44.4ms)

OneFlow resnet50 time: 39.0ms (= 7797.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 66.9ms (= 13370.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.71 (= 66.9ms / 39.0ms)

github-actions[bot] avatar Oct 28 '22 11:10 github-actions[bot]

CI failed when running job: cpu-module. PR label automerge has been removed

github-actions[bot] avatar Oct 28 '22 11:10 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 139.4ms (= 13941.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.4ms (= 16243.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 162.4ms / 139.4ms)

OneFlow resnet50 time: 84.6ms (= 8464.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 103.0ms (= 10296.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.22 (= 103.0ms / 84.6ms)

OneFlow resnet50 time: 57.4ms (= 11489.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 88.7ms (= 17746.6ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.54 (= 88.7ms / 57.4ms)

OneFlow resnet50 time: 45.4ms (= 9087.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.9ms (= 14187.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.56 (= 70.9ms / 45.4ms)

OneFlow resnet50 time: 39.5ms (= 7908.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.7ms (= 13736.6ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.74 (= 68.7ms / 39.5ms)

github-actions[bot] avatar Nov 04 '22 04:11 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8961/

github-actions[bot] avatar Nov 04 '22 04:11 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 139.7ms (= 13965.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 163.9ms (= 16387.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 163.9ms / 139.7ms)

OneFlow resnet50 time: 84.7ms (= 8465.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.9ms (= 10189.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 101.9ms / 84.7ms)

OneFlow resnet50 time: 57.5ms (= 11504.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.5ms (= 15704.6ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.37 (= 78.5ms / 57.5ms)

OneFlow resnet50 time: 44.3ms (= 8869.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.5ms (= 15699.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.77 (= 78.5ms / 44.3ms)

OneFlow resnet50 time: 39.2ms (= 7849.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.1ms (= 13823.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.76 (= 69.1ms / 39.2ms)

github-actions[bot] avatar Nov 07 '22 11:11 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8961/

github-actions[bot] avatar Nov 07 '22 11:11 github-actions[bot]