MIOpen
AMD's Machine Intelligence Library
In function `verify_backward_data_lstm::gpu()` we seemingly inadvertently rely on the workspace being `zeroed out`. We create a `std::vector` for workspace just to create a gpu buffer `workspace_dev` with `handle.Write()`. Creating this...
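A hidden dependence on a zeroed workspace, as described in the excerpt above, can be exposed by running the same computation once on a zero-filled buffer and once on a poisoned one. The following is a minimal host-side sketch, not MIOpen code: `accumulate_into_workspace` and `relies_on_zeroed_workspace` are hypothetical names, and the accumulating loop only stands in for whatever kernel reads the workspace before writing it.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a kernel that accumulates into the workspace
// instead of overwriting it, so its result depends on the initial contents.
std::vector<float> accumulate_into_workspace(std::vector<float> workspace,
                                             const std::vector<float>& input)
{
    assert(workspace.size() == input.size());
    for(std::size_t i = 0; i < input.size(); ++i)
        workspace[i] += input[i];
    return workspace;
}

// Returns true when the result changes if the workspace starts out poisoned
// rather than zeroed, i.e. when the code silently relies on zero-initialization.
bool relies_on_zeroed_workspace(const std::vector<float>& input)
{
    const float poison = std::numeric_limits<float>::quiet_NaN();

    // std::vector<float>(n) value-initializes every element to 0.0f.
    const std::vector<float> zeroed(input.size());
    const std::vector<float> poisoned(input.size(), poison);

    const auto from_zeroed   = accumulate_into_workspace(zeroed, input);
    const auto from_poisoned = accumulate_into_workspace(poisoned, input);

    for(std::size_t i = 0; i < input.size(); ++i)
    {
        // NaN never compares equal, so any propagated poison is detected here.
        if(!(from_zeroed[i] == from_poisoned[i]))
            return true;
    }
    return false;
}
```

Calling `relies_on_zeroed_workspace({1.0f, 2.0f, 3.0f})` reports the dependence for this stand-in, because the NaN poison propagates through the accumulation. The same idea carries over to a device buffer by uploading a NaN-filled vector instead of a default-constructed one.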
The shell script depends on the MIOpenDriver binary.
Based on https://github.com/ROCmSoftwarePlatform/MIOpen/pull/2379#discussion_r1323596346

Currently `ConvSolution` is initialized with `miopenStatusSuccess` by default, but it is not actually initialized with meaningful values:

```CPP
ConvSolution(miopenStatus_t status_ = miopenStatusSuccess)
    : status(status_),
      solver_id(""),
      invoker_factory(boost::none),
      workspace_sz(0),
      grp_tile1(-1),
      grp_tile0(-1),
      ...
```
- https://github.com/ROCmSoftwarePlatform/MIOpen/labels/urgency_high https://github.com/ROCmSoftwarePlatform/MIOpen/labels/request_for_comments Finally decide how to enable experimental API features
  - [ ] https://github.com/ROCmSoftwarePlatform/MIOpen/pull/2320#discussion_r1316419544
- https://github.com/ROCmSoftwarePlatform/MIOpen/labels/urgency_blocker Let's enable and properly test the experimental feature everywhere.
  - [x] https://github.com/ROCmSoftwarePlatform/MIOpen/pull/2320#discussion_r1326549529
- ...
Split the Resnet50 file by data type and batch size to parallelize the CI stage. Added a try/catch around the k_time comparison to let CI pass when a stage fails (to be removed...
Hi, while running an LSTM-based model using ROCm I get the following error, whereas with CUDA on an NVIDIA GPU it works fine. I checked the size of the tensor is...
When adapting [Mask2Former](https://github.com/facebookresearch/Mask2Former) to PyTorch-ROCm, I am facing a `MIOpen Error: /.../data/driver/MLOpen/src/sqlite_db.cpp:209: Internal error while accessing SQLite database: locking protocol`. Python: 3.8.16 GPU: GFX90 PyTorch is installed by: `pip install...
The following test case is not enabled on gfx11x and gfx94x ASICs since the Winograd kernel is disabled for such a config. Currently, I am creating a workaround to keep...
Attention mechanisms are widely used in deep learning models, particularly in large language models. A flexible attention kernel can help users build accelerated language models conveniently on AMD...
This issue happens when MIOpen wants to use HIPCC_RTC and standard libraries like **#include &lt;limits&gt;** at the same time. In general hipRTC does not allow the inclusion of std headers,...
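One way to sidestep the restriction on standard headers is to provide the few constants a kernel needs without including `<limits>` at all. The sketch below is only an illustration and is not claimed to be what MIOpen does: the `rtc_sketch` namespace and its `numeric_limits` struct are hypothetical, and it assumes the compiler predefines `__FLT_MAX__`, `__FLT_MIN__` and `__INT_MAX__`, as GCC and Clang do.

```cpp
// Header-free stand-in for a small part of std::numeric_limits.
// Assumption: the compiler predefines __FLT_MAX__, __FLT_MIN__ and
// __INT_MAX__ (GCC and Clang do). All names here are hypothetical.
namespace rtc_sketch {

template <typename T>
struct numeric_limits;

template <>
struct numeric_limits<float>
{
    static constexpr float max() { return __FLT_MAX__; }
    // Smallest positive normal value, matching std::numeric_limits<float>::min().
    static constexpr float min() { return __FLT_MIN__; }
    static constexpr float lowest() { return -__FLT_MAX__; }
};

template <>
struct numeric_limits<int>
{
    static constexpr int max() { return __INT_MAX__; }
    static constexpr int min() { return -__INT_MAX__ - 1; }
};

} // namespace rtc_sketch
```

Kernel code compiled at run time could then refer to `rtc_sketch::numeric_limits<float>::max()` where it would otherwise use the standard trait, at the cost of maintaining one specialization per type that is actually needed.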