oneflow
fix_logsumexp_overflow_error
Fix an overflow error in the result computed by flow.logsumexp:
>>> import oneflow as flow
>>> x = flow.tensor([100, 200])
>>> flow.logsumexp(x, 0)
/home/hanbinbin/oneflow/python/oneflow/framework/tensor_str.py:145: RuntimeWarning: invalid value encountered in true_divide
nonzero_finite_max / nonzero_finite_min > 1000.0
tensor(inf, dtype=oneflow.float32)
>>> flow.logsumexp(x, 0)
tensor(inf, dtype=oneflow.float32)
>>>
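For reference, the mathematically expected result for this input is approximately 200, since log(exp(100) + exp(200)) = 200 + log(1 + exp(-100)). In float32, exp(100) already exceeds the largest representable value (about 3.4e38), so evaluating the formula directly yields inf. A common remedy is to subtract the maximum before exponentiating. The following is a minimal pure-Python sketch of that technique, not the implementation in this PR; the function name `stable_logsumexp` is hypothetical.

```python
import math


def stable_logsumexp(values):
    """Compute log(sum(exp(v) for v in values)) for a non-empty sequence.

    Hypothetical helper for illustration only; it is not OneFlow code.
    Shifting by the maximum keeps every exponent <= 0, so exp() cannot overflow.
    """
    m = max(values)
    # If the maximum is +inf or -inf, the result is that same infinity;
    # returning early also avoids computing inf - inf below.
    if math.isinf(m):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


result = stable_logsumexp([100.0, 200.0])
print(result)
```

Because the largest shifted term is always exp(0) = 1, the sum inside the logarithm lies between 1 and the number of elements, so the result stays finite for any finite input.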
Static analysis with clang failed. PR label automerge has been removed
CI failed when running job: cuda-module. PR label automerge has been removed
Speed stats:
GPU Name: GeForce GTX 1080
❌ OneFlow resnet50 time: 139.4ms (= 13935.7ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 160.4ms (= 16039.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.15 (= 160.4ms / 139.4ms)
OneFlow resnet50 time: 84.9ms (= 8487.0ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 111.4ms (= 11135.1ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.31 (= 111.4ms / 84.9ms)
OneFlow resnet50 time: 57.8ms (= 11558.1ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 77.5ms (= 15506.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.34 (= 77.5ms / 57.8ms)
OneFlow resnet50 time: 44.8ms (= 8963.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.0ms (= 14190.4ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.58 (= 71.0ms / 44.8ms)
OneFlow resnet50 time: 40.1ms (= 8019.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 68.4ms (= 13677.5ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.71 (= 68.4ms / 40.1ms)
View latest API docs preview at: https://staging.oneflow.info/docs/Oneflow-Inc/oneflow/pr/9385/