Ole Schütt
@hfp, is this with the Intel compiler? With GCC I have not seen any crashes myself.
> At least for the miniapp, relying on larger arrays/allocations on the stack proves unreliable. Yes, you might have to move [some](https://github.com/cp2k/cp2k/blob/fedce27c3a36d9031d1dc52189396fabd5aa852b/src/dbm/dbm_miniapp.c#L32) of the [arrays](https://github.com/cp2k/cp2k/blob/fedce27c3a36d9031d1dc52189396fabd5aa852b/src/dbm/dbm_miniapp.c#L72) from the stack to the...
Yes, the miniapp uses 18x18 blocks, which is not representative of the actual applications. Unfortunately, DBM does not yet collect statistics. Generally, it's not uncommon that one of the block...
I collected the block size stats for [RI-HFX_H2O-32.inp](https://github.com/cp2k/cp2k/blob/master/benchmarks/QS_single_node/RI-HFX_H2O-32.inp) in a quick-and-dirty way. As expected, it's very diverse: ``` # m x n x k count 36 x...
@mkrack since you raised this issue during yesterday's meeting, I took another look. Unfortunately, I still don't see a way to make progress. The symptoms are quite confusing and without...
I've talked to an Nvidia engineer: The NCCL restriction of one MPI rank per GPU won't go away anytime soon. I've also learned that it's difficult to achieve good performance...
>> it's difficult to achieve good performance with GPU-aware MPI, because it does not support streams and therefore requires additional host/device synchronizations. > Specifically for COSMA case, this is not...
While I don't know the exact parameters that triggered this problem, I doubt that empty ranks are the cause. When there are not enough columns for every rank, then we...
Yes, we create a [new blacs grid](https://github.com/cp2k/cp2k/blob/master/src/fm/cp_fm_diag_utils.F#L386) and a [new elpa object](https://github.com/cp2k/cp2k/blob/master/src/fm/cp_fm_elpa.F#L433) for each invocation. Since all of this has been in place since November 2019, I suspect the problem...
> I do not see a call to descinit and especially to check its "info" return value The call to `descinit` is somewhat hidden, but it does check the returned...