MIOpen

AMD's Machine Intelligence Library

Results 298 MIOpen issues

Another byproduct of #3181 [LastTest.log](https://github.com/user-attachments/files/16567842/LastTest.log) The error message: ``` /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/shared_ptr_base.h:199:9: runtime error: member call on address 0x00000b9e6590 which does not point to an object of type 'std::_Sp_counted_base' 0x00000b9e6590: note: object...

urgency_blocker

Byproduct of https://github.com/ROCm/MIOpen/pull/3181 [CI log (access may be restricted)](http://micimaster.amd.com/blue/rest/organizations/jenkins/pipelines/MLLIBS/pipelines/MIOpen/branches/update_ck_0805/runs/8/nodes/292/log/?start=0) What worries us is the line: ``` MIOpen(HIP): Warning [BuildHip] /tmp/comgr-8efef0/input/static_kernel_gridwise_convolution_backward_data_implicit_gemm_v4r1_nchw_kcyx_nkhw.cpp:6:60: error: failed to meet occupancy target given by 'amdgpu-waves-per-eu' in...

GTest

2D and 3D NGCHW backward data convolution solvers based on CK instances.

This PR resolves issues related to building MIOpen with [spack](https://github.com/spack/spack). In Spack, all of these packages are installed in different paths, not in a single `/opt/rocm` path, which is why...

Implement 2D and 3D NGCHW backward convolution solvers based on CK NGCHW layout instances.

Move test_perf.py to prevent overlap in the ROCm bin dir. Add a CMake flag for installing test_perf.py. https://ontrack-internal.amd.com/browse/SWDEV-522566

TESTING_CI_PASSED

Implement 2D and 3D NGCHW forward convolution solvers based on CK NGCHW layout instances. The solvers utilize the fused CK convolution and layout transform kernels to avoid the tensor layout...

Now the priorities for find enforce are:

1. FindOptions
2. TuningPolicy (this is the new thing)
3. MIOPEN_FIND_ENFORCE value
4. defaulting to FindEnforceAction::None

This PR introduces **TuningPolicy (this is the...
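The priority order listed above amounts to a "first setting that is present wins" lookup. The following is only a minimal sketch of that idea, not MIOpen's actual code: the function `resolve_find_enforce`, its parameter names, and the assumption that each source yields an optional action are all hypothetical, and the `SEARCH` member is included only as an illustrative non-default value.

```python
from enum import Enum
from typing import Optional


class FindEnforceAction(Enum):
    # Only "None" is named in the snippet above; SEARCH is an
    # illustrative placeholder for any non-default action.
    NONE = "none"
    SEARCH = "search"


def resolve_find_enforce(
    find_options: Optional[FindEnforceAction] = None,
    tuning_policy: Optional[FindEnforceAction] = None,
    env_value: Optional[FindEnforceAction] = None,
) -> FindEnforceAction:
    """Return the first setting that is present, in priority order.

    Hypothetical helper: assumes each source either supplies an action
    or is absent (None in Python terms).
    """
    # 1. FindOptions, 2. TuningPolicy, 3. MIOPEN_FIND_ENFORCE value
    for candidate in (find_options, tuning_policy, env_value):
        if candidate is not None:
            return candidate
    # 4. Nothing was set: default to FindEnforceAction::None
    return FindEnforceAction.NONE
```

In this sketch an explicitly set higher-priority value always shadows lower ones, even when that value is `NONE`, which is why absence is modelled separately from the `NONE` action.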

Enable 3D grouped fwd for gfx1100. With the following test, there is a 100x uplift for Conv3d fwd on gfx1100: ``` import torch import torch.nn as nn import time device =...