
List styled ErrorFrame

Open lixinqi opened this issue 3 years ago • 5 comments

Refactor the vector-based StackedError into a list-based ErrorFrame. This eliminates incorrect error display and small memory leaks in certain extreme cases.

lixinqi avatar Jul 17 '22 18:07 lixinqi

Speed stats:
GPU Name: NVIDIA GeForce GTX 1080 

❌ OneFlow resnet50 time: 129.4ms (= 12940.4ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.7ms (= 14374.7ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.11 (= 143.7ms / 129.4ms)

OneFlow resnet50 time: 75.8ms (= 7578.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 85.8ms (= 8576.9ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.13 (= 85.8ms / 75.8ms)

OneFlow resnet50 time: 48.2ms (= 9645.3ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 59.0ms (= 11798.1ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.22 (= 59.0ms / 48.2ms)

OneFlow resnet50 time: 39.2ms (= 7832.4ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 45.5ms (= 9100.5ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.16 (= 45.5ms / 39.2ms)

OneFlow resnet50 time: 32.7ms (= 6533.8ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 37.3ms (= 7452.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.14 (= 37.3ms / 32.7ms)

OneFlow swin dataloader time: 0.282s (= 56.410s / 200, num_workers=1)
PyTorch swin dataloader time: 0.150s (= 30.023s / 200, num_workers=1)
Relative speed: 0.532 (= 0.150s / 0.282s)

OneFlow swin dataloader time: 0.083s (= 16.541s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.471s / 200, num_workers=4)
Relative speed: 0.512 (= 0.042s / 0.083s)

OneFlow swin dataloader time: 0.045s (= 8.993s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.486s / 200, num_workers=8)
Relative speed: 0.499 (= 0.022s / 0.045s)

❌ OneFlow resnet50 time: 145.3ms (= 14529.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 169.9ms (= 16991.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 169.9ms / 145.3ms)

OneFlow resnet50 time: 94.0ms (= 9395.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 112.6ms (= 11255.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 112.6ms / 94.0ms)

OneFlow resnet50 time: 69.1ms (= 13827.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 95.7ms (= 19146.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.38 (= 95.7ms / 69.1ms)

OneFlow resnet50 time: 56.4ms (= 11283.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.2ms (= 15049.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.33 (= 75.2ms / 56.4ms)

OneFlow resnet50 time: 51.2ms (= 10233.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.7ms (= 13949.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.36 (= 69.7ms / 51.2ms)

github-actions[bot] avatar Jul 18 '22 05:07 github-actions[bot]

View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8674/

github-actions[bot] avatar Aug 25 '22 11:08 github-actions[bot]

Speed stats:
GPU Name: GeForce GTX 1080 

❌ OneFlow resnet50 time: 129.3ms (= 12931.5ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 140.4ms (= 14041.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.09 (= 140.4ms / 129.3ms)

OneFlow resnet50 time: 74.4ms (= 7441.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.4ms (= 8438.0ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.13 (= 84.4ms / 74.4ms)

OneFlow resnet50 time: 46.9ms (= 9373.2ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 55.6ms (= 11123.5ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.19 (= 55.6ms / 46.9ms)

OneFlow resnet50 time: 34.4ms (= 6871.8ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 43.9ms (= 8779.4ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.28 (= 43.9ms / 34.4ms)

OneFlow resnet50 time: 28.2ms (= 5649.0ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 42.3ms (= 8456.5ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.50 (= 42.3ms / 28.2ms)

OneFlow swin dataloader time: 0.271s (= 54.144s / 200, num_workers=1)
PyTorch swin dataloader time: 0.152s (= 30.335s / 200, num_workers=1)
Relative speed: 0.560 (= 0.152s / 0.271s)

OneFlow swin dataloader time: 0.073s (= 14.513s / 200, num_workers=4)
PyTorch swin dataloader time: 0.044s (= 8.760s / 200, num_workers=4)
Relative speed: 0.604 (= 0.044s / 0.073s)

OneFlow swin dataloader time: 0.040s (= 7.935s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.452s / 200, num_workers=8)
Relative speed: 0.561 (= 0.022s / 0.040s)

❌ OneFlow resnet50 time: 137.9ms (= 13792.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.0ms (= 16096.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 161.0ms / 137.9ms)

OneFlow resnet50 time: 84.4ms (= 8440.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 106.9ms (= 10686.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.27 (= 106.9ms / 84.4ms)

OneFlow resnet50 time: 57.0ms (= 11392.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.9ms (= 15974.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.40 (= 79.9ms / 57.0ms)

OneFlow resnet50 time: 44.1ms (= 8822.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.9ms (= 14381.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.63 (= 71.9ms / 44.1ms)

OneFlow resnet50 time: 38.5ms (= 7697.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.7ms (= 14544.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.89 (= 72.7ms / 38.5ms)

github-actions[bot] avatar Aug 25 '22 11:08 github-actions[bot]

incorrect error display and small memory leaks in certain extreme cases

In what situations does this happen?

daquexian avatar Aug 26 '22 00:08 daquexian

incorrect error display and small memory leaks in certain extreme cases

In what situations does this happen?

Maybe<void> Bar() {
  UNIMPLEMENTED_THEN_RETURN();
}

Maybe<void> Foo() {
    JUST(Bar());
    return Maybe<void>::Ok();
}

constexpr auto* CachedFoo = DECORATE(&Foo, ThreadLocal);

Maybe<void> Test() {
    JUST(CachedFoo());
    return Maybe<void>::Ok();
}

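Because CachedFoo caches the entire Maybe, every call in Test returns the same Maybe, so the ErrorStack is the same object each time. If Test is wrapped in TRY, that is, if an upper layer uses it as TRY(Test()), then every execution records the error stack into the same ErrorStack object cached by ThreadLocal, which causes a memory leak.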
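The accumulation described above can be reproduced in a small standalone sketch. Nothing here is OneFlow code: `VectorError`, `FrameNode`, `ListError`, and all the functions are hypothetical stand-ins, and the linked-list variant is only an assumption about what a "list styled" frame could look like (immutable nodes that point to their parent), not a description of the actual ErrorFrame implementation.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a vector-based stacked error (not OneFlow's class).
struct VectorError {
  std::string msg;
  std::vector<std::string> frames;  // mutable stack of call-site records
};

// Models a thread-local cached function: every call on a thread returns
// the same error object.
std::shared_ptr<VectorError> CachedFailingCall() {
  thread_local auto cached =
      std::make_shared<VectorError>(VectorError{"unimplemented", {}});
  return cached;
}

// Models a caller that records its own frame before propagating the error.
// The push_back mutates the cached object, so frames pile up across calls.
std::shared_ptr<VectorError> TestWithVector() {
  auto err = CachedFailingCall();
  err->frames.push_back("Test");
  return err;
}

// Assumed alternative: frames form an immutable singly linked list.
struct FrameNode {
  std::string location;
  std::shared_ptr<const FrameNode> parent;
};

struct ListError {
  std::string msg;
  std::shared_ptr<const FrameNode> top;  // most recent frame, may be null
};

// The cached value is returned by copy; the copy shares the cached frames
// but cannot modify them.
ListError CachedFailingCallList() {
  thread_local const ListError cached{"unimplemented", nullptr};
  return cached;
}

// Adding a frame allocates a new head node owned by the returned copy.
// The cached object is left untouched.
ListError TestWithList() {
  ListError err = CachedFailingCallList();
  err.top = std::make_shared<FrameNode>(FrameNode{"Test", err.top});
  return err;
}

// Counts the frames reachable from an error.
std::size_t Depth(const ListError& err) {
  std::size_t n = 0;
  for (auto node = err.top; node; node = node->parent) ++n;
  return n;
}
```

In this sketch, calling `TestWithVector()` three times on one thread leaves three frames in the cached object, and the count keeps growing with every further call. With the list variant, each `TestWithList()` result has a depth of 1 and the cached value stays at depth 0, because the new frame is released together with the returned error.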

lixinqi avatar Aug 26 '22 02:08 lixinqi