Mario Lezcano Casado comments

Results: 237 comments of Mario Lezcano Casado

"Intel MKL ERROR: Parameter 8 was incorrect on entry to SSYEVD." for matrix of [53000,53000]

We would accept a fix following https://github.com/pytorch/pytorch/issues/92141#issuecomment-1382241971

Replace log(1 + x) with log1p(x)

What about the last two files? Also, cc @fritzo for confirmation on whether we can use `log1p` on distributions. I would say it'd be very much desirable to use it...
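
As a rough illustration of why `log1p` is preferable, here is a minimal sketch in plain Python using the standard `math` module (not PyTorch code; the helper `naive_log1p` and the sample value are hypothetical and chosen only for illustration). For very small `x`, `1 + x` rounds to exactly `1.0` in double precision, so `log(1 + x)` loses all information about `x`, while `log1p(x)` does not.

```python
import math


def naive_log1p(x: float) -> float:
    """Compute log(1 + x) the naive way (hypothetical helper for comparison)."""
    return math.log(1.0 + x)


# x is far below double-precision machine epsilon (about 2.2e-16),
# so 1.0 + x rounds to exactly 1.0 before the logarithm is taken.
x = 1e-18

naive = naive_log1p(x)      # the information about x is lost in the addition
accurate = math.log1p(x)    # computed without forming 1 + x explicitly

print(f"log(1 + x) = {naive!r}")
print(f"log1p(x)   = {accurate!r}")
```

The same reasoning applies to any floating-point type: the smaller the precision, the larger the range of `x` for which the naive form degrades.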

[CUDA12] Conditionally set device in device guard

This is the only PR left to close https://github.com/pytorch/pytorch/issues/91122. PTAL @ngimel

Solving the under/overflow for complex division

That path is vectorised, so it uses vectorised CPU operations. Have a look at how they are implemented within `aten/src/ATen/cpu/vec`. Fixing those while keeping not-too-bad performance is going to...

Solving the under/overflow for complex division

You'd still need to fix the AVX2 and AVX512 implementations of div accordingly. And sure, you can use hypot there, that may be faster. When you do that, it'd still...
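
For readers unfamiliar with the problem in the thread title, the sketch below shows, in plain Python, how the textbook complex-division formula overflows and how a scaled formulation avoids it. It uses Smith's method as an illustrative assumption; it is not the implementation from the PR, and the function names `naive_div` and `smith_div` are hypothetical.

```python
import math


def naive_div(a: float, b: float, c: float, d: float) -> tuple[float, float]:
    """(a + bi) / (c + di) via the textbook formula; c*c + d*d can overflow."""
    den = c * c + d * d
    return ((a * c + b * d) / den, (b * c - a * d) / den)


def smith_div(a: float, b: float, c: float, d: float) -> tuple[float, float]:
    """(a + bi) / (c + di) via Smith's method, which scales by the ratio of
    the smaller to the larger denominator component and never squares them.
    Assumes the denominator is non-zero."""
    if abs(c) >= abs(d):
        r = d / c
        den = c + d * r
        return ((a + b * r) / den, (b - a * r) / den)
    r = c / d
    den = c * r + d
    return ((a * r + b) / den, (b * r - a) / den)


big = 1e200  # big * big exceeds the double-precision range and becomes inf

naive_re, naive_im = naive_div(big, big, big, big)
smith_re, smith_im = smith_div(big, big, big, big)

print("naive:", naive_re, naive_im)   # intermediate inf values poison the result
print("smith:", smith_re, smith_im)   # mathematically the quotient is 1 + 0i
```

Smith's method costs extra divisions and a branch, which is the kind of trade-off that matters when porting such a fix to a vectorised code path.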

Solving the under/overflow for complex division

At any rate, I'd still suggest first merging this PR, and then fixing the vectorised path on a follow-up PR. If what you are proposing is to implement `div` as...

Solving the under/overflow for complex division

This PR already fixed some operations, so you can remove the xfails on those! Also, it seems to be failing on Windows, so we need to fix that.

Solving the under/overflow for complex division

To see the different test failures, go to https://hud.pytorch.org/pr/92539 and select the commit you want to inspect.

Solving the under/overflow for complex division

mac builds are complaining as well. It looks like this is a "non-standard extension" of the standard, as [`std::abs` is not defined to be constexpr](https://en.cppreference.com/w/cpp/numeric/math/fabs) (ugh). Could you check whether...

Solving the under/overflow for complex division

I'd keep the `std::abs` for gcc, as it's the most used compiler for PyTorch and it generates the fastest Intel instruction. Otherwise, yep, let's fall back to the other implementation.