Rostyslav Geyyer
- Add element op - Add instances - Add example - Add client example
- Refactor f8_t and bf8_t
- Update conversion methods
- Update load method
- Add dynamic buffer custom types support
- Update threadwise conversion

Right now custom types are supported...
- Add vectorization support for custom types defined as non-native types or user-defined objects - Add a test to check that native and non-native vector types take the same expected space...
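The size check mentioned above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `custom8_t`, `non_native_vector`, `native_u8x4`, and `same_size` are hypothetical names, not the library's types, and the native vector relies on the GCC/Clang `vector_size` extension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a user-defined 8-bit type (not the library's f8_t).
struct custom8_t
{
    std::uint8_t data;
};

// Hypothetical non-native vector: a plain aggregate holding N elements of T.
template <typename T, std::size_t N>
struct non_native_vector
{
    T elems[N];
};

// Native 4-byte vector built with the GCC/Clang vector_size extension.
typedef std::uint8_t native_u8x4 __attribute__((vector_size(4)));

// True when both vector representations occupy the same number of bytes.
template <typename Native, typename NonNative>
constexpr bool same_size()
{
    return sizeof(Native) == sizeof(NonNative);
}

// Compile-time version of the check: the build fails if the sizes diverge.
static_assert(same_size<native_u8x4, non_native_vector<custom8_t, 4>>(),
              "native and non-native vectors should take the same space");
```

Using `static_assert` moves the check to compile time, so a layout mismatch is caught before any test binary runs.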
- Add a naive gpu gemm reference kernel - Switch gemm examples to this verification path
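For readers unfamiliar with the idea, a naive reference GEMM computes each output element with a plain dot product and no tiling or vectorization. The sketch below is host-side C++ rather than a GPU kernel so that it compiles anywhere; `naive_gemm_element` and `naive_gemm` are hypothetical names, and row-major `float` matrices are an assumption, not something taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes one output element C[row][col] as a dot product over K.
// In a GPU version, each thread would typically run this for its own (row, col).
inline float naive_gemm_element(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t K,
                                std::size_t N,
                                std::size_t row,
                                std::size_t col)
{
    float acc = 0.0f;
    for(std::size_t k = 0; k < K; ++k)
    {
        acc += a[row * K + k] * b[k * N + col]; // row-major A (MxK) and B (KxN)
    }
    return acc;
}

// Fills a row-major MxN result by visiting every output element in turn.
inline std::vector<float> naive_gemm(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     std::size_t M,
                                     std::size_t N,
                                     std::size_t K)
{
    std::vector<float> c(M * N, 0.0f);
    for(std::size_t row = 0; row < M; ++row)
    {
        for(std::size_t col = 0; col < N; ++col)
        {
            c[row * N + col] = naive_gemm_element(a, b, K, N, row, col);
        }
    }
    return c;
}
```

A reference like this is slow, but its simplicity makes it a reasonable baseline to compare an optimized kernel's output against.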
We have discovered flaky fp8/bf8 tests that fail on ROCm 6.1 and newer with round-to-nearest-even rounding. Disabling this test https://github.com/ROCm/composable_kernel/pull/1495 until a compiler patch becomes available. @illsilin @junliume
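As background on the rounding mode named above, round-to-nearest-even resolves exact ties toward the value whose lowest kept bit is zero. The sketch below shows this on an unsigned integer whose low bits are dropped, which is the same kind of step a narrowing float conversion performs on a mantissa. It is not the library's fp8/bf8 conversion code; `round_shift_rne` is a hypothetical helper and assumes `1 <= shift <= 31`.

```cpp
#include <cassert>
#include <cstdint>

// Drops the low `shift` bits of x, rounding to nearest with ties going to even.
// Assumes 1 <= shift <= 31 (hypothetical helper for illustration only).
inline std::uint32_t round_shift_rne(std::uint32_t x, unsigned shift)
{
    const std::uint32_t kept      = x >> shift;              // bits that survive
    const std::uint32_t remainder = x & ((1u << shift) - 1); // bits being dropped
    const std::uint32_t half      = 1u << (shift - 1);       // exact midpoint

    // Round up when above the midpoint, or on a tie when the kept value is odd.
    if(remainder > half || (remainder == half && (kept & 1u) != 0))
    {
        return kept + 1;
    }
    return kept;
}
```

For example, with `shift = 2`, the inputs 10 (binary 1010) and 14 (binary 1110) are both exact ties: the first stays at 2 because 2 is already even, while the second rounds up from 3 to 4.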
## Proposed changes

This is a temporary PR with a repro example

## Checklist

Please put an `x` into the boxes that apply. You can also fill these out after...