Inplace masked fill

Open doombeaker opened this issue 3 years ago • 2 comments
image

doombeaker avatar Sep 22 '22 09:09 doombeaker

Code got formatted by CI. Please request CI again if you still want to have this PR merged. If the PR is from a forked repo, please download the patch files from the GitHub Actions web page and apply them locally.

github-actions[bot] avatar Sep 22 '22 09:09 github-actions[bot]

Speed stats:

github-actions[bot] avatar Sep 22 '22 15:09 github-actions[bot]

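The flow.masked_fill() interface needs to be wrapped.

OK. As far as I can see, torch only has torch.Tensor.masked_fill_, so I will likewise only export this interface under flow and not add documentation for it.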
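
For readers unfamiliar with the trailing-underscore convention discussed here, the following is a minimal pure-Python sketch of the difference between an out-of-place and an in-place masked fill. The helper names `masked_fill` and `masked_fill_` below are hypothetical stand-ins written for illustration only; they are not the OneFlow or PyTorch implementations, and they work on plain lists rather than tensors.

```python
from typing import List, Sequence


def masked_fill(values: Sequence[float], mask: Sequence[bool], fill: float) -> List[float]:
    """Out-of-place variant (hypothetical helper): returns a new list."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    return [fill if m else v for v, m in zip(values, mask)]


def masked_fill_(values: List[float], mask: Sequence[bool], fill: float) -> List[float]:
    """In-place variant (hypothetical helper): overwrites `values` and returns it."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    for i, m in enumerate(mask):
        if m:
            values[i] = fill
    return values


data = [1.0, 2.0, 3.0, 4.0]
mask = [False, True, False, True]

filled_copy = masked_fill(data, mask, 0.0)   # `data` is left untouched here
snapshot = list(data)                        # remember the state before mutating
same_object = masked_fill_(data, mask, 0.0)  # `data` itself is modified
```

The in-place variant returns the same object it was given, which is why it avoids an extra allocation but also why callers must not rely on the original values afterwards.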

doombeaker avatar Sep 23 '22 01:09 doombeaker

Speed stats:

github-actions[bot] avatar Sep 23 '22 02:09 github-actions[bot]

CI failed when running job: cpu-module. PR label automerge has been removed

github-actions[bot] avatar Sep 23 '22 03:09 github-actions[bot]

Speed stats:

github-actions[bot] avatar Sep 23 '22 03:09 github-actions[bot]

Speed stats:

github-actions[bot] avatar Sep 23 '22 04:09 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9133/

github-actions[bot] avatar Sep 26 '22 01:09 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.6ms (= 13964.2ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.7ms (= 16069.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 160.7ms / 139.6ms)

OneFlow resnet50 time: 85.4ms (= 8543.9ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 102.7ms (= 10274.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 102.7ms / 85.4ms)

OneFlow resnet50 time: 58.2ms (= 11646.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.0ms (= 15596.0ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 78.0ms / 58.2ms)

OneFlow resnet50 time: 44.7ms (= 8943.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.0ms (= 15007.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.68 (= 75.0ms / 44.7ms)

OneFlow resnet50 time: 40.5ms (= 8094.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 67.8ms (= 13552.9ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.67 (= 67.8ms / 40.5ms)
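
The arithmetic behind these lines can be reproduced with a short sketch. It assumes, based only on the formulas printed in the report, that the per-iteration time is the total time divided by the iteration count and that relative speed is the PyTorch time divided by the OneFlow time. The function names are hypothetical and not part of the CI tooling.

```python
def per_iteration_ms(total_ms: float, iterations: int) -> float:
    """Average time per iteration, as in '139.6ms (= 13964.2ms / 100)'."""
    return total_ms / iterations


def relative_speed(pytorch_ms: float, oneflow_ms: float) -> float:
    """Ratio above 1.0 means OneFlow took less time per iteration than PyTorch."""
    return pytorch_ms / oneflow_ms


# Figures taken from the batch-size-16 lines of the report above.
oneflow_ms = per_iteration_ms(13964.2, 100)
pytorch_ms = per_iteration_ms(16069.1, 100)
speed = relative_speed(pytorch_ms, oneflow_ms)

print(f"OneFlow: {oneflow_ms:.1f}ms, PyTorch: {pytorch_ms:.1f}ms, relative speed: {speed:.2f}")
```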

github-actions[bot] avatar Sep 26 '22 01:09 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 140.6ms (= 14055.3ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 163.5ms (= 16352.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.16 (= 163.5ms / 140.6ms)

OneFlow resnet50 time: 85.9ms (= 8590.6ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 101.2ms (= 10120.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.18 (= 101.2ms / 85.9ms)

OneFlow resnet50 time: 58.3ms (= 11659.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 78.3ms (= 15653.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 78.3ms / 58.3ms)

OneFlow resnet50 time: 45.3ms (= 9061.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.4ms (= 14078.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.55 (= 70.4ms / 45.3ms)

OneFlow resnet50 time: 40.2ms (= 8041.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.7ms (= 15542.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.93 (= 77.7ms / 40.2ms)

github-actions[bot] avatar Sep 26 '22 07:09 github-actions[bot]

CI failed when running job: Build cu102. PR label automerge has been removed

github-actions[bot] avatar Sep 26 '22 14:09 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9133/

github-actions[bot] avatar Sep 27 '22 02:09 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080

❌ OneFlow resnet50 time: 139.6ms (= 13960.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.5ms (= 16049.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 160.5ms / 139.6ms)

OneFlow resnet50 time: 85.7ms (= 8566.7ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 104.4ms (= 10437.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.22 (= 104.4ms / 85.7ms)

OneFlow resnet50 time: 58.1ms (= 11614.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 87.8ms (= 17561.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.51 (= 87.8ms / 58.1ms)

OneFlow resnet50 time: 45.4ms (= 9087.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.8ms (= 14151.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.56 (= 70.8ms / 45.4ms)

OneFlow resnet50 time: 40.3ms (= 8050.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.8ms (= 13752.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.71 (= 68.8ms / 40.3ms)

github-actions[bot] avatar Sep 27 '22 02:09 github-actions[bot]