MIOpen issues

Dropout kernel OpenCL to HIP + gtest

6

Initial attempt at translating the Dropout OpenCL kernel to HIP, together with a gtest. The hardcoded PRNG matrices are replaced with rocRAND function calls.

sgundabo

Implement PReLU backward

3

* Added PReLU backward operation and kernels.
* Added driver test and gtest for PReLU backward operation.
* New API is guarded by MIOPEN_BETA_API macro.
* Compared to ROCm pytorch:...

long10024070

enhancement

external_collaborator
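
For readers unfamiliar with the operation, the math behind a PReLU backward pass can be sketched in plain Python. This assumes the PyTorch definition `PReLU(x) = max(0, x) + a * min(0, x)` with a single shared weight `a`. The function name `prelu_backward` is hypothetical and is not the MIOpen API or the kernel from this PR.

```python
def prelu_backward(x, weight, grad_out):
    """Return (grad_input, grad_weight) for y = max(0, x) + weight * min(0, x).

    x and grad_out are flat lists of floats; weight is one shared scalar
    (an assumption made for this sketch).
    """
    grad_input = []
    grad_weight = 0.0
    for xi, gi in zip(x, grad_out):
        if xi > 0:
            # Positive inputs pass the upstream gradient through unchanged.
            grad_input.append(gi)
        else:
            # Non-positive inputs are scaled by the learnable weight and
            # contribute x * grad_out to the weight gradient.
            grad_input.append(weight * gi)
            grad_weight += xi * gi
    return grad_input, grad_weight


dx, dw = prelu_backward([1.0, -2.0, 3.0, -4.0], 0.25, [1.0, 1.0, 2.0, 2.0])
print(dx, dw)
```

A per-channel weight would accumulate `grad_weight` separately for each channel instead of into one scalar.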

Implement Fold and Unfold

- Added [Fold](https://pytorch.org/docs/stable/generated/torch.nn.Fold.html) and [Unfold](https://pytorch.org/docs/stable/generated/torch.nn.Unfold.html) ops.
- Full benchmark result compared to ROCm: Here
- Average performance:

| Op | Dtype | Direction | Time |
|--------|-----------|-----------|--------------|
| Unfold |...

DuongQLee

enhancement

TESTING_CI_PASSED

external_collaborator

Implement NLLLoss

6

* Added [NLLLoss](https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html) forward and backward operation and kernel.
* Added driver test and gtest for NLLLoss.
* New API is guarded by MIOPEN_BETA_API macro.

Nllloss float16 op_name |...

hieule88

enhancement

external_collaborator
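
As a reference for what the forward and backward operations compute, here is a minimal pure-Python sketch of a mean-reduced NLL loss following the PyTorch definition linked above. The function name `nll_loss` and its signature are hypothetical; `ignore_index` and the other reduction modes are left out, and none of this reflects the MIOpen API itself.

```python
def nll_loss(log_probs, targets, weights=None):
    """Mean-reduced NLL loss and its gradient with respect to log_probs.

    log_probs: N rows of C log-probabilities; targets: N class indices;
    weights: optional per-class weights (defaults to all ones).
    """
    n_classes = len(log_probs[0])
    if weights is None:
        weights = [1.0] * n_classes
    # The mean reduction divides by the summed weight of the target classes.
    total_weight = sum(weights[t] for t in targets)
    loss = -sum(weights[t] * row[t] for row, t in zip(log_probs, targets))
    loss /= total_weight
    # Only the entry selected by each target has a non-zero gradient.
    grad = [[0.0] * n_classes for _ in log_probs]
    for n, t in enumerate(targets):
        grad[n][t] = -weights[t] / total_weight
    return loss, grad


loss, grad = nll_loss([[-0.5, -1.0], [-2.0, -0.25]], [0, 1])
print(loss, grad)
```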

Implement MultiMarginLoss forward

9

- Add [MultiMarginLoss](https://pytorch.org/docs/stable/generated/torch.nn.MultiMarginLoss.html) forward operation and kernel. Backward is not better compared to ROCm in general.
- Given an input tensor of shape (N, C), MIOpen is better if C is small...

littlecutebird

enhancement

external_collaborator
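
For context, the forward computation for an (N, C) input can be sketched as follows, based on the PyTorch definition linked above with mean reduction and no per-class weights. The name `multi_margin_loss` and its parameters are illustrative assumptions, not the MIOpen API.

```python
def multi_margin_loss(inputs, targets, p=1, margin=1.0):
    """Mean multi-class margin loss over N samples with C classes each.

    inputs: N rows of C scores; targets: N class indices.
    """
    total = 0.0
    for row, y in zip(inputs, targets):
        sample = 0.0
        for i, score in enumerate(row):
            if i == y:
                continue  # the target class is excluded from the sum
            # Penalise classes scoring within `margin` of the target class.
            sample += max(0.0, margin - row[y] + score) ** p
        total += sample / len(row)  # per-sample sum is divided by C
    return total / len(inputs)


loss = multi_margin_loss([[0.5, 0.25, 1.0, 0.0]], [2])
print(loss)
```

Each sample loops over all C classes, so the amount of work per row grows with C.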

Implement MSELoss Function

12

This PR ports the `MSELoss` family of loss functions to MIOpen:
- `MSELoss`
- `MSELossUnreduced`

Performance measurements seem to suggest that in general we're performing better than ROCm on forward,...

o2buzzle

enhancement

external_collaborator
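
The difference between the two variants can be sketched in a few lines of Python. This assumes `MSELossUnreduced` corresponds to PyTorch's `reduction='none'` (one squared error per element) and `MSELoss` to the `'mean'` or `'sum'` reductions. The function names below are hypothetical stand-ins, not the MIOpen API.

```python
def mse_loss_unreduced(inputs, targets):
    """Element-wise squared error, one value per input element."""
    return [(x - t) ** 2 for x, t in zip(inputs, targets)]


def mse_loss(inputs, targets, reduction="mean"):
    """Squared error reduced to a single scalar by 'mean' or 'sum'."""
    per_element = mse_loss_unreduced(inputs, targets)
    total = sum(per_element)
    if reduction == "sum":
        return total
    if reduction == "mean":
        return total / len(per_element)
    raise ValueError(f"unsupported reduction: {reduction}")


x = [1.0, 2.0, 4.0, 0.0]
t = [1.0, 0.0, 2.0, 0.0]
print(mse_loss_unreduced(x, t), mse_loss(x, t), mse_loss(x, t, "sum"))
```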

Backward MHA test using C++ API

2

Provides a C++ Graph API test for backward MHA. It does not execute the graph yet because the graph engine is still in development. A follow-up PR will enable graph...

amberhassaan

Batchnorm Fused Inference OpenCL to HIP

1

This PR focuses on converting the Batch Norm Fused Inference kernel from OpenCL to HIP. This conversion is a part of the broader initiative to translate all OpenCL kernels within...

sgundabo

Refactor BnCKFwdInference::GetSolution for NHWC

1. Rename RunCKSolution to InitInvokerFactoryBnCKFwdInferenceNHWC to differentiate it from the upcoming new API InitInvokerFactoryBnCKFwdInferenceNCHW.
2. Move common code to implicitgemm_ck_util.hpp.

xinlipn

miopenReduceTensor MIOPEN_REDUCE_TENSOR_AVG is failing when using f16 datatype

When trying to apply an average reduction on a tensor filled with `float16` elements, we encounter overflow issues. We configure the operation to use `float32` as the compute datatype, ensuring...

kala855
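
To show why the accumulator type matters for an average over `float16` data, the sketch below emulates half-precision rounding with Python's `struct` module (format code `e`). It is a general illustration of float16 overflow, not a diagnosis of this MIOpen issue, which reports the failure even with a `float32` compute type. The helper names are hypothetical, and Python's double-precision floats stand in for the wider accumulator.

```python
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct rejects finite values outside the float16 range (max 65504);
        # a hardware conversion would produce infinity instead.
        return float("inf") if x > 0 else float("-inf")


def mean_f16_accumulator(values):
    """Average with the running sum rounded to float16 after every add."""
    acc = 0.0
    for v in values:
        acc = to_f16(acc + to_f16(v))
    return to_f16(acc / len(values))


def mean_wide_accumulator(values):
    """Average with a wide running sum; only the result is cast to float16."""
    acc = 0.0
    for v in values:
        acc += to_f16(v)
    return to_f16(acc / len(values))


values = [256.0] * 512  # the true sum is 131072, beyond the float16 range
narrow = mean_f16_accumulator(values)
wide = mean_wide_accumulator(values)
print(narrow, wide)
```

Here the half-precision running sum overflows to infinity, while dividing in the wider type before the final cast gives a mean that float16 can represent.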
