Omri Mor

Results: 43 comments by Omri Mor

The website is still down, unfortunately. Any updates to share?

I am encountering the same issue while running a tile low-rank Cholesky where the tiles are smaller than we've used before, resulting in many more tasks and more communication messages—though,...

I think I've been experiencing a similar issue, compounded by the fact that in my use case it is not reliably reproducible; whether it occurs appears to be stochastic....

Changing the `pstlvars` scripts would be necessary as well for this solution, of course.

vcpkg is doing [something similar](https://github.com/microsoft/vcpkg/blob/020923a98dac40b55098170ae3dcb65a4eab58b5/ports/parallelstl/fix-cmakelist.patch), which is a further signal that this may be the right approach (and that an upstream fix is necessary).

LLVM upstream seems to handle this differently. The pstl stdlib headers are in the `include` directory, but prefixed with `__pstl_` (e.g. `include/__pstl_algorithm`). The standard headers in the libcxx project will...

I have also gotten other UCX errors:
```
[exp-8-18:3546090] COPY-OPAL-VALUE: UNSUPPORTED TYPE 0
[exp-8-18:3546090] OPAL ERROR: Error in file base/pmix_base_hash.c at line 256
[1660858547.102648] [exp-8-18:3546090:0] address.c:877 UCX ERROR failed to...
```

> * Is this a regression (did you run the same task with older OMPI/UCX)?

I have not yet attempted to replicate this particular issue with older OMPI/UCX.

> *...

Some additional context is that this appears to be related to issues in dynamically connecting processes; setting the `mpi_preconnect_mpi` MCA option avoids these issues. According to @bosilca there have been...
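For reference, a minimal sketch of how that option can be set, assuming Open MPI's usual conventions for MCA parameters (an `OMPI_MCA_`-prefixed environment variable, or `--mca` on the `mpirun` command line). The application name `./my_app` and the process count are placeholders, not taken from the original report.

```shell
# Enable preconnection for every Open MPI job launched from this shell.
export OMPI_MCA_mpi_preconnect_mpi=1

# Equivalent per-run form (hypothetical application and process count):
#   mpirun --mca mpi_preconnect_mpi 1 -np 4 ./my_app
```

With this set, connections between processes are established during `MPI_Init` rather than lazily on first communication, which trades some startup time for avoiding the dynamic-connection path.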

I'm using UCX v1.13.0 and Open MPI 4.1.4 and have still seen the issue; I suspect it's down to application-specific behavior.