Mark Gates
Nice utility. This fixes 2 minor issues. See commits for details.
Tested with ROCm 5.2.
```
> hipcc --version
HIP version: 5.2.21151-afdc89f8
AMD clang version 14.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-5.2.0 22204 50d6d5d5b608d2abd6af44314abc6ad20036af3b)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-5.2.0/llvm/bin
```
Reproducer: ``` //...
**Description** Per LAPACK docs, in [sd]tgexc, both `ifst` and `ilst` are [in,out], while in [cz]tgexc, `ifst` is [in] and `ilst` is [in,out]: https://github.com/Reference-LAPACK/lapack/blob/master/SRC/stgexc.f#L142-L147 https://github.com/Reference-LAPACK/lapack/blob/master/SRC/ctgexc.f#L133-L138 In LAPACKE_[sd]tgexc, both `ifst` and `ilst`...
**Description** The workspace query & docs for `tgsen` appear to be off by 1. It claims to need `2*M*(N-M)`, but then calls `tgsyl` with `LWORK-2*N1*N2` where N1 = M, N2...
1. For slightly tall (m > n and not m >> n) matrices, the internal U and VT matrices were incorrectly sized. Uhat and U1 are m x n, U2...
1. The `tile` namespace was previously added to `Tile_blas.hh`. This adds it to `Tile_lapack.hh` and similar headers.
2. Many files needlessly included `Tile_blas.hh` and similar headers. If a file doesn't...
This is a loose collection of TODO items.
- [ ] Standardize order of layout, priority, queue. See PR #151 for comments.
- [ ] Rename tbsmPivots => tbsm_pivots
-...
Depends on https://github.com/icl-utk-edu/testsweeper/pull/21, https://github.com/icl-utk-edu/blaspp/pull/77, and https://github.com/icl-utk-edu/lapackpp/pull/57

Builds on #164, method enums. Use standard `to_string` function to convert enum to string. Introduce `to_char` and `to_c_string` functions. Introduce `from_string` function to...
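
As a rough illustration of the conversion functions named above, here is a minimal sketch. The `Method` enum, its values, and the exact signatures (in particular `from_string` taking an output pointer and throwing on unknown input) are assumptions made for this example, not the definitions used in the PR.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical enum for illustration only; not taken from the PR.
// Each value's underlying char doubles as its one-letter code.
enum class Method : char {
    Auto = 'a',
    Fast = 'f',
    Safe = 's',
};

// Returns the one-letter code, which is the enum's underlying value.
inline char to_char( Method value )
{
    return static_cast<char>( value );
}

// Returns a string literal with static storage, so no allocation occurs.
inline const char* to_c_string( Method value )
{
    switch (value) {
        case Method::Auto: return "auto";
        case Method::Fast: return "fast";
        case Method::Safe: return "safe";
    }
    return "?";  // reached only for an out-of-range value
}

// Returns an owning std::string built from the C string.
inline std::string to_string( Method value )
{
    return std::string( to_c_string( value ) );
}

// Parses either the full name or the one-letter code.
// The output pointer lets overload resolution pick the enum type
// (an assumed convention for this sketch).
inline void from_string( std::string const& str, Method* value )
{
    if (str == "auto" || str == "a")
        *value = Method::Auto;
    else if (str == "fast" || str == "f")
        *value = Method::Fast;
    else if (str == "safe" || str == "s")
        *value = Method::Safe;
    else
        throw std::invalid_argument( "unknown Method: " + str );
}
```

With one overload set per enum, callers can write `to_string( value )` and `from_string( str, &value )` uniformly, regardless of which enum type is involved.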
Also preparing for `to_string`, etc. in next PR.
The `gpu_bind.sh` script avoids oversubscribing GPUs, which can be detrimental, e.g. on a DGX, 4 ranks, each with 8 GPUs! However, it no longer tests the multi-GPU per MPI rank...