Alex Eremin comments

Results: 46 comments by Alex Eremin

[CONV] fix naive conv kernel for large tensors

> > We do have https://github.com/ROCm/MIOpen/blob/develop/test/gpu_reference_kernel.cpp Yes, that's a naive, single-threaded, ultra-slow CPU verification of the naive GPU algorithm. That test is not about "huge" tensors, it has exactly...

[CONV] fix naive conv kernel for large tensors

> Yes, we do need to do the slow CPU run. I can make the test a nightly run. > I'm not sure that we do need it. It depends on the...

[DRAFT] Fix op tensor nhwc

Tensor operations are layout-agnostic; the most important limitation is that all the tensors must have the same layout. Another limitation of the current implementation: all the kernels, even the...
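The "same layout" limitation can be illustrated with a small sketch. This is not MIOpen code: the helper names `nchw_to_nhwc` and `elementwise_add` are hypothetical, and the element-wise add on flat buffers is only an assumed stand-in for a layout-agnostic tensor operation.

```python
def nchw_to_nhwc(flat, n, c, h, w):
    """Reorder a flat NCHW buffer into a flat NHWC buffer (hypothetical helper)."""
    out = [0] * (n * c * h * w)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi  # NCHW offset
                    dst = ((ni * h + hi) * w + wi) * c + ci  # NHWC offset
                    out[dst] = flat[src]
    return out


def elementwise_add(a, b):
    """Layout-agnostic add: walks both flat buffers in lockstep, ignoring shape."""
    return [x + y for x, y in zip(a, b)]


n, c, h, w = 1, 2, 2, 3
a_nchw = list(range(n * c * h * w))
b_nchw = [10 * v for v in a_nchw]

# Reference: add in NCHW, then view the result as NHWC.
ref_nhwc = nchw_to_nhwc(elementwise_add(a_nchw, b_nchw), n, c, h, w)

# Both inputs in NHWC: the same flat loop gives the correct NHWC result.
same_layout = elementwise_add(
    nchw_to_nhwc(a_nchw, n, c, h, w), nchw_to_nhwc(b_nchw, n, c, h, w)
)

# Mixed layouts: the flat loop pairs up unrelated elements.
mixed_layout = elementwise_add(nchw_to_nhwc(a_nchw, n, c, h, w), b_nchw)

print(same_layout == ref_nhwc)
print(mixed_layout == ref_nhwc)
```

The flat loop never looks at the layout, which is why it works for any layout as long as every operand shares it, and why it silently pairs the wrong elements when the layouts differ.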

[DRAFT] Fix op tensor nhwc

We decided to close this PR because there is another approach to fixing that problem.

Impl pad reflection

Is this https://github.com/ROCm/MIOpen/labels/external_collaborator or https://github.com/ROCm/MIOpen/labels/enhancement? Could you also add an appropriate description?

Batchnorm Fused Inference OpenCL to HIP

It can be safely merged since it does not affect production code. I just don't want to lose this PR.

Batchnorm Fused Inference OpenCL to HIP

That PR is also important, since it provides a centralized definition for a set of activation functions (for example, #3247 has its own local definition of sigmoid: https://github.com/ROCm/MIOpen/pull/3247/files#diff-2a117e014b2a1c04feb3ede9723a78a8d11d656b4e9631fa57ab1d7c58df55d6)...

Alex Eremin

[CONV] fix naive conv kernel for large tensors

[CONV] fix naive conv kernel for large tensors

[DRAFT] Fix op tensor nhwc

[DRAFT] Fix op tensor nhwc

Impl pad reflection

Batchnorm Fused Inference OpenCL to HIP

Batchnorm Fused Inference OpenCL to HIP

Dropout kernel OpenCL to HIP + gtest

Dropout kernel OpenCL to HIP + gtest

Implement MSELoss Function