MIOpen
AMD's Machine Intelligence Library
- Add L1Loss operation with forward reduced kernels.
- Add driver and gtest for kernels.
- MIOpen performs better if:
  - Reduction mode is either sum or mean
### Average...
- Add IndexSelect operation with forward and backward kernels.
- Add driver and gtest for kernels.
- MIOpen performs better if:
  - Number of output elements is less than 100000...

This PR replaces `std::vector` with `InlineVector` in `TensorDescriptor` for lens/strides (`InlineVector`: https://github.com/ROCm/MIOpen/pull/3419). I did some performance testing to compare `InlineVector` with `std::vector` (initialization, element access, etc.). The results are that...
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6. Release notes (sourced from jinja2's releases), 3.1.6: This is the Jinja 3.1.6 security release, which fixes security issues but does not otherwise change behavior...
Hello, we have a good amount of existing OpenCL code that is not trivial to port to HIP. I understand the OpenCL backend is deprecated, but there's an argument for...
Helping with https://github.com/ROCm/MIOpen/pull/3370: merged develop into Reid's branch, refactored the easier outliers, and made keyword detection more robust by requiring a word boundary.
Revert https://github.com/ROCm/MIOpen/pull/3362 after finding and fixing the corresponding bug/defect. CC @junliume @bpepers-me
MIOpen with HEAD at mainline fails to compile with `MIOPEN_ENABLE_SQLITE_KERN_CACHE` set to `OFF`. This patch attempts to fix this (build passes, not further tested).
- Detail of [operation](https://www.tensorflow.org/api_docs/python/tf/raw_ops/GatherV2) (TensorFlow)
- Add GatherV2 operation with non-batched and batched backward kernels.
- Add driver and gtest for kernels.
### Average improvement over ROCm
| type |...
- Add CosineSimilarity operation with forward and backward kernels.
- Add driver and gtest for kernels.
- MIOpen performs better if:
  - Number of output elements exceeds 20000
### Average...