oneflow
List styled ErrorFrame
Refactor the vector-based StackedError into a list-based ErrorFrame. This eliminates incorrect error display and a small memory leak in certain extreme cases.
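To illustrate the difference between the two layouts, here is a minimal sketch. It is not OneFlow's actual implementation: the types VectorStyleError and ListStyleFrame and the helpers PropagateVector, PropagateList, and Depth are hypothetical names made up for this example. It also assumes that the extreme case involves an error object that is shared (for example, cached) and then propagated more than once.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical vector-style error: the frames live inside the error object,
// so every holder of the same shared error sees (and grows) the same vector.
struct VectorStyleError {
  std::string message;
  std::vector<std::string> frames;
};

// Propagating one level up mutates the shared error in place.
std::shared_ptr<VectorStyleError> PropagateVector(std::shared_ptr<VectorStyleError> err,
                                                  const std::string& location) {
  assert(err != nullptr);
  err->frames.push_back(location);
  return err;
}

// Hypothetical list-style frame: a node that points to its parent and is
// never modified after construction.
struct ListStyleFrame {
  std::string location;
  std::shared_ptr<const ListStyleFrame> parent;
};

// Propagating one level up allocates a new head node. The existing chain is
// shared by reference and left untouched.
std::shared_ptr<const ListStyleFrame> PropagateList(std::shared_ptr<const ListStyleFrame> tail,
                                                    const std::string& location) {
  return std::make_shared<ListStyleFrame>(ListStyleFrame{location, std::move(tail)});
}

// Number of frames reachable from the given head node.
std::size_t Depth(const std::shared_ptr<const ListStyleFrame>& head) {
  std::size_t depth = 0;
  for (const ListStyleFrame* f = head.get(); f != nullptr; f = f->parent.get()) { ++depth; }
  return depth;
}
```

Under these assumptions, propagating a shared vector-style error twice from the same call site leaves it with duplicated frames that keep accumulating. Propagating a shared list-style frame twice yields two independent heads of equal depth, and the shared chain stays unchanged.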
Speed stats:
GPU Name: NVIDIA GeForce GTX 1080
❌ OneFlow resnet50 time: 129.4ms (= 12940.4ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 143.7ms (= 14374.7ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.11 (= 143.7ms / 129.4ms)
OneFlow resnet50 time: 75.8ms (= 7578.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 85.8ms (= 8576.9ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.13 (= 85.8ms / 75.8ms)
OneFlow resnet50 time: 48.2ms (= 9645.3ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 59.0ms (= 11798.1ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.22 (= 59.0ms / 48.2ms)
OneFlow resnet50 time: 39.2ms (= 7832.4ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 45.5ms (= 9100.5ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.16 (= 45.5ms / 39.2ms)
OneFlow resnet50 time: 32.7ms (= 6533.8ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 37.3ms (= 7452.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.14 (= 37.3ms / 32.7ms)
OneFlow swin dataloader time: 0.282s (= 56.410s / 200, num_workers=1)
PyTorch swin dataloader time: 0.150s (= 30.023s / 200, num_workers=1)
Relative speed: 0.532 (= 0.150s / 0.282s)
OneFlow swin dataloader time: 0.083s (= 16.541s / 200, num_workers=4)
PyTorch swin dataloader time: 0.042s (= 8.471s / 200, num_workers=4)
Relative speed: 0.512 (= 0.042s / 0.083s)
OneFlow swin dataloader time: 0.045s (= 8.993s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.486s / 200, num_workers=8)
Relative speed: 0.499 (= 0.022s / 0.045s)
❌ OneFlow resnet50 time: 145.3ms (= 14529.5ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 169.9ms (= 16991.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 169.9ms / 145.3ms)
OneFlow resnet50 time: 94.0ms (= 9395.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 112.6ms (= 11255.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.20 (= 112.6ms / 94.0ms)
OneFlow resnet50 time: 69.1ms (= 13827.5ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 95.7ms (= 19146.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.38 (= 95.7ms / 69.1ms)
OneFlow resnet50 time: 56.4ms (= 11283.7ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 75.2ms (= 15049.9ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.33 (= 75.2ms / 56.4ms)
OneFlow resnet50 time: 51.2ms (= 10233.0ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 69.7ms (= 13949.8ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.36 (= 69.7ms / 51.2ms)
View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/8674/
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 129.3ms (= 12931.5ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 140.4ms (= 14041.6ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.09 (= 140.4ms / 129.3ms)
OneFlow resnet50 time: 74.4ms (= 7441.6ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.4ms (= 8438.0ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.13 (= 84.4ms / 74.4ms)
OneFlow resnet50 time: 46.9ms (= 9373.2ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 55.6ms (= 11123.5ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.19 (= 55.6ms / 46.9ms)
OneFlow resnet50 time: 34.4ms (= 6871.8ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 43.9ms (= 8779.4ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.28 (= 43.9ms / 34.4ms)
OneFlow resnet50 time: 28.2ms (= 5649.0ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 42.3ms (= 8456.5ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.50 (= 42.3ms / 28.2ms)
OneFlow swin dataloader time: 0.271s (= 54.144s / 200, num_workers=1)
PyTorch swin dataloader time: 0.152s (= 30.335s / 200, num_workers=1)
Relative speed: 0.560 (= 0.152s / 0.271s)
OneFlow swin dataloader time: 0.073s (= 14.513s / 200, num_workers=4)
PyTorch swin dataloader time: 0.044s (= 8.760s / 200, num_workers=4)
Relative speed: 0.604 (= 0.044s / 0.073s)
OneFlow swin dataloader time: 0.040s (= 7.935s / 200, num_workers=8)
PyTorch swin dataloader time: 0.022s (= 4.452s / 200, num_workers=8)
Relative speed: 0.561 (= 0.022s / 0.040s)
❌ OneFlow resnet50 time: 137.9ms (= 13792.8ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 161.0ms (= 16096.1ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.17 (= 161.0ms / 137.9ms)
OneFlow resnet50 time: 84.4ms (= 8440.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 106.9ms (= 10686.5ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.27 (= 106.9ms / 84.4ms)
OneFlow resnet50 time: 57.0ms (= 11392.2ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 79.9ms (= 15974.4ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.40 (= 79.9ms / 57.0ms)
OneFlow resnet50 time: 44.1ms (= 8822.0ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.9ms (= 14381.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.63 (= 71.9ms / 44.1ms)
OneFlow resnet50 time: 38.5ms (= 7697.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 72.7ms (= 14544.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.89 (= 72.7ms / 38.5ms)
"Incorrect error display and a small memory leak in certain extreme cases"
Under what circumstances does this occur?
Maybe<void> Bar() {
  UNIMPLEMENTED_THEN_RETURN();
}

Maybe<void> Foo() {
  JUST(Bar());
  return Maybe<void>::Ok();
}

constexpr auto* CachedFoo = DECORATE(&Foo, ThreadLocal);

Maybe<void> Test() {
  JUST(Foo());
  return Maybe<void>::Ok();
}
Because CachedFoo takes the entire Maybe