MIOpen
AMD's Machine Intelligence Library
Another byproduct of #3181 [LastTest.log](https://github.com/user-attachments/files/16567842/LastTest.log) The error message: ``` /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/shared_ptr_base.h:199:9: runtime error: member call on address 0x00000b9e6590 which does not point to an object of type 'std::_Sp_counted_base' 0x00000b9e6590: note: object...
Byproduct of https://github.com/ROCm/MIOpen/pull/3181 [CI log (access may be restricted)](http://micimaster.amd.com/blue/rest/organizations/jenkins/pipelines/MLLIBS/pipelines/MIOpen/branches/update_ck_0805/runs/8/nodes/292/log/?start=0) What worries us is the line: ``` MIOpen(HIP): Warning [BuildHip] /tmp/comgr-8efef0/input/static_kernel_gridwise_convolution_backward_data_implicit_gemm_v4r1_nchw_kcyx_nkhw.cpp:6:60: error: failed to meet occupancy target given by 'amdgpu-waves-per-eu' in...
2D and 3D NGCHW backward data convolution solvers based on CK instances.
This PR resolves issues related to building MIOpen with [spack](https://github.com/spack/spack). In Spack, all of these packages are installed in different paths, not in a single `/opt/rocm` path, which is why...
Implement 2D and 3D NGCHW backward convolution solvers based on CK NGCHW layout instances.
Move test_perf.py to prevent overlap in the ROCm bin directory. Add a CMake flag for installing test_perf.py. https://ontrack-internal.amd.com/browse/SWDEV-522566
Implement 2D and 3D NGCHW forward convolution solvers based on CK NGCHW layout instances. The solver utilizes the fused CK convolution and layout transform kernels to avoid the tensor layout...
Now the priorities for find enforce are:

1. FindOptions
2. TuningPolicy (this is the new thing)
3. MIOPEN_FIND_ENFORCE value
4. defaulting to FindEnforceAction::None

This PR introduces **TuningPolicy (this is the...
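The priority order for find enforce described in this entry can be sketched as a first-match lookup. This is a minimal illustration in Python, not MIOpen's C++ implementation: `resolve_find_enforce`, its parameter names, and the `SEARCH` and `DB_UPDATE` enum members are hypothetical. Only the four priority levels and the `FindEnforceAction::None` default come from the text.

```python
from enum import Enum
from typing import Optional


class FindEnforceAction(Enum):
    # Illustrative members; only `None` is named in the PR summary.
    NONE = "none"
    SEARCH = "search"
    DB_UPDATE = "db_update"


def resolve_find_enforce(
    find_options: Optional[FindEnforceAction] = None,
    tuning_policy: Optional[FindEnforceAction] = None,
    env_value: Optional[FindEnforceAction] = None,
) -> FindEnforceAction:
    """Return the first source that is set, in priority order.

    Order (highest first): FindOptions, TuningPolicy,
    MIOPEN_FIND_ENFORCE value, then the default of NONE.
    """
    for candidate in (find_options, tuning_policy, env_value):
        if candidate is not None:
            return candidate
    return FindEnforceAction.NONE


if __name__ == "__main__":
    # TuningPolicy overrides the environment value when FindOptions is unset.
    action = resolve_find_enforce(
        tuning_policy=FindEnforceAction.SEARCH,
        env_value=FindEnforceAction.DB_UPDATE,
    )
    print(action.name)
```

Under this sketch, an unset source is modeled as `None` so that a lower-priority source is consulted only when every higher-priority one is absent.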
Enable 3D grouped forward convolution for gfx1100. With the following test, there is a 100x uplift for Conv3d fwd on gfx1100: ``` import torch import torch.nn as nn import time device =...