
Add MaxUnpool op

Open marigoold opened this issue 3 years ago • 19 comments

marigoold avatar Oct 25 '22 09:10 marigoold

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.9ms (= 13988.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.4ms (= 16239.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 162.4ms / 139.9ms)

OneFlow resnet50 time: 87.0ms (= 8701.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.3ms (= 10434.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 104.3ms / 87.0ms)

OneFlow resnet50 time: 58.7ms (= 11736.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 87.6ms (= 17512.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.49 (= 87.6ms / 58.7ms)

OneFlow resnet50 time: 45.2ms (= 9048.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.7ms (= 15346.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.70 (= 76.7ms / 45.2ms)

OneFlow resnet50 time: 40.1ms (= 8015.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.4ms (= 13482.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.68 (= 67.4ms / 40.1ms)
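
The arithmetic behind these figures can be checked by hand: each per-iteration time is the total time divided by the iteration count, and the relative speed is the PyTorch time divided by the OneFlow time. Below is a minimal sketch using the first pair of numbers above. The function names are hypothetical and are not taken from the CI script, and it assumes the ratio is computed from the unrounded totals.

```python
def per_iteration_ms(total_ms: float, iterations: int) -> float:
    """Average time per iteration, in milliseconds."""
    return total_ms / iterations


def relative_speed(pytorch_ms: float, oneflow_ms: float) -> float:
    """Ratio above 1.0 means OneFlow took less time than PyTorch."""
    return pytorch_ms / oneflow_ms


# Totals reported for input_shape=[16, 3, 224, 224], 100 iterations each.
oneflow_ms = per_iteration_ms(13988.7, 100)
pytorch_ms = per_iteration_ms(16239.4, 100)
speedup = relative_speed(pytorch_ms, oneflow_ms)

print(f"OneFlow: {oneflow_ms:.1f}ms")      # OneFlow: 139.9ms
print(f"PyTorch: {pytorch_ms:.1f}ms")      # PyTorch: 162.4ms
print(f"Relative speed: {speedup:.2f}")    # Relative speed: 1.16
```

A relative speed above 1.0 therefore means OneFlow finished an iteration faster than PyTorch for that input shape.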

github-actions[bot] avatar Oct 28 '22 08:10 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9309/

github-actions[bot] avatar Oct 28 '22 08:10 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.7ms (= 13968.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 162.4ms (= 16243.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 162.4ms / 139.7ms)

OneFlow resnet50 time: 84.7ms (= 8466.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.4ms (= 10141.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 101.4ms / 84.7ms)

OneFlow resnet50 time: 57.4ms (= 11482.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.6ms (= 15516.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 77.6ms / 57.4ms)

OneFlow resnet50 time: 44.8ms (= 8970.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 80.9ms (= 16176.3ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.80 (= 80.9ms / 44.8ms)

OneFlow resnet50 time: 42.1ms (= 8424.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.8ms (= 13753.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.63 (= 68.8ms / 42.1ms)

github-actions[bot] avatar Oct 31 '22 06:10 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.6ms (= 13957.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.0ms (= 15997.3ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 160.0ms / 139.6ms)

OneFlow resnet50 time: 84.9ms (= 8485.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.2ms (= 10220.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 102.2ms / 84.9ms)

OneFlow resnet50 time: 57.4ms (= 11474.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.3ms (= 15657.7ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.36 (= 78.3ms / 57.4ms)

OneFlow resnet50 time: 45.2ms (= 9034.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.3ms (= 14267.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.58 (= 71.3ms / 45.2ms)

OneFlow resnet50 time: 40.9ms (= 8174.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.0ms (= 15201.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.86 (= 76.0ms / 40.9ms)

github-actions[bot] avatar Nov 01 '22 06:11 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.6ms (= 13959.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.4ms (= 16044.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 160.4ms / 139.6ms)

OneFlow resnet50 time: 84.7ms (= 8472.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.5ms (= 10145.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 101.5ms / 84.7ms)

OneFlow resnet50 time: 58.1ms (= 11617.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.7ms (= 15748.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.36 (= 78.7ms / 58.1ms)

OneFlow resnet50 time: 44.7ms (= 8949.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.7ms (= 13530.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.51 (= 67.7ms / 44.7ms)

OneFlow resnet50 time: 40.5ms (= 8095.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.9ms (= 13573.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.68 (= 67.9ms / 40.5ms)

github-actions[bot] avatar Nov 01 '22 08:11 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9309/

github-actions[bot] avatar Nov 01 '22 08:11 github-actions[bot]

You could also add a global test.

mosout avatar Nov 07 '22 09:11 mosout

You could also add a global test.

Added.

marigoold avatar Nov 08 '22 12:11 marigoold

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.4ms (= 13943.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.2ms (= 16117.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 161.2ms / 139.4ms)

OneFlow resnet50 time: 85.3ms (= 8531.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 111.1ms (= 11115.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.30 (= 111.1ms / 85.3ms)

OneFlow resnet50 time: 57.3ms (= 11452.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 76.7ms (= 15340.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 76.7ms / 57.3ms)

OneFlow resnet50 time: 44.2ms (= 8839.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.1ms (= 13823.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.56 (= 69.1ms / 44.2ms)

OneFlow resnet50 time: 40.3ms (= 8053.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 63.7ms (= 12732.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.58 (= 63.7ms / 40.3ms)

github-actions[bot] avatar Nov 08 '22 13:11 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9309/

github-actions[bot] avatar Nov 08 '22 13:11 github-actions[bot]

Speed stats:

github-actions[bot] avatar Nov 10 '22 07:11 github-actions[bot]

Consider adding bfp16 support as well.

Now supported.

marigoold avatar Nov 10 '22 14:11 marigoold

You could add a profile function; see here for reference: https://github.com/Oneflow-Inc/OneTeam/blob/master/tutorial/howto_test_user_op.md#%E6%80%A7%E8%83%BD%E6%B5%8B%E8%AF%95

Added.

marigoold avatar Nov 10 '22 14:11 marigoold

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions[bot] avatar Nov 10 '22 14:11 github-actions[bot]

CI failed when running job: Build cu102. PR label automerge has been removed

github-actions[bot] avatar Nov 10 '22 17:11 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.5ms (= 13949.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.0ms (= 16003.4ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 160.0ms / 139.5ms)

OneFlow resnet50 time: 85.3ms (= 8525.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.8ms (= 10183.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.19 (= 101.8ms / 85.3ms)

OneFlow resnet50 time: 57.6ms (= 11511.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.2ms (= 15634.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.36 (= 78.2ms / 57.6ms)

OneFlow resnet50 time: 45.4ms (= 9081.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.8ms (= 13968.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.54 (= 69.8ms / 45.4ms)

OneFlow resnet50 time: 39.2ms (= 7845.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.3ms (= 13667.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.74 (= 68.3ms / 39.2ms)

github-actions[bot] avatar Nov 11 '22 20:11 github-actions[bot]

CI failed when running job: cuda-module. PR label automerge has been removed

github-actions[bot] avatar Nov 11 '22 20:11 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.5ms (= 13952.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.3ms (= 16129.3ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 161.3ms / 139.5ms)

OneFlow resnet50 time: 85.0ms (= 8496.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 106.6ms (= 10662.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.25 (= 106.6ms / 85.0ms)

OneFlow resnet50 time: 57.8ms (= 11557.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.9ms (= 15574.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 77.9ms / 57.8ms)

OneFlow resnet50 time: 44.1ms (= 8826.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.9ms (= 13976.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.58 (= 69.9ms / 44.1ms)

OneFlow resnet50 time: 40.5ms (= 8097.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 74.2ms (= 14843.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.83 (= 74.2ms / 40.5ms)

github-actions[bot] avatar Nov 12 '22 14:11 github-actions[bot]

CI failed when running job: cuda-module. PR label automerge has been removed

github-actions[bot] avatar Nov 12 '22 14:11 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.7ms (= 13967.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.5ms (= 16053.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 160.5ms / 139.7ms)

OneFlow resnet50 time: 85.4ms (= 8538.8ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.5ms (= 10154.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.19 (= 101.5ms / 85.4ms)

OneFlow resnet50 time: 57.5ms (= 11509.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.5ms (= 15497.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.35 (= 77.5ms / 57.5ms)

OneFlow resnet50 time: 44.2ms (= 8847.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.0ms (= 14200.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.60 (= 71.0ms / 44.2ms)

OneFlow resnet50 time: 40.0ms (= 7990.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.3ms (= 14054.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.76 (= 70.3ms / 40.0ms)

github-actions[bot] avatar Nov 14 '22 06:11 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9309/

github-actions[bot] avatar Nov 14 '22 07:11 github-actions[bot]