Ben Menadue
A user on our cluster has hit a very similar issue with v4.1.3, but using `MPI_Win_allocate` and `MPI_Fetch_and_op`, and only when the two ranks have different sized windows (everything but...
@BenWibking Hmm, that's unexpected, thanks for letting us know. When we originally tested our OS we couldn't see much in the way of "OS-level noise". I'm wondering if one of...
For reference, so far today (it's currently 12:20pm) Loi has made 520k requests to our PBS Pro server.
From our end, we're seeing an SSH session on a login node that's running `qstat -x -f` periodically against a number of jobs. While each individual job is only being...
Even better again, you could use PBS's ability to send e-mails on state change (e.g. start, end, abort) to become properly event-driven and avoid polling altogether.
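As a rough illustration of the e-mail approach, a job script can ask PBS for notifications with the standard `-m` and `-M` directives. This is only a sketch: the job name, address, and walltime below are placeholders, and whether mail is actually delivered depends on how the site has configured PBS.

```shell
#!/bin/bash
# Hypothetical job name, used only for this example.
#PBS -N example_job
# Send mail when the job is aborted (a), begins (b), and ends (e).
#PBS -m abe
# Placeholder recipient address; replace with a real one.
#PBS -M user@example.com
# Placeholder resource request.
#PBS -l walltime=00:10:00

# The actual work of the job goes here.
echo "job running on $(hostname)"
```

A pipeline can then react to the incoming messages rather than calling `qstat` repeatedly.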
I'm not sure I understand your reasons for fast polling, sorry. How does missing a state change break your pipeline? If attempting to stat a job fails because it has...
@bosilca I'm in two minds about that. Yes, delegating up might make it cleaner, but it's really a component-specific property (e.g. other shmem components don't need a filename, and the...
I've rebased the changes onto current master.
PIDs aren't unique when you consider the full uptime of a system - they will eventually be reused (e.g. the login nodes of our cluster fairly regularly roll over the pid...
Yes, that would be a much better approach. I didn't realise there was already a unique name being passed down from the upper layers. Just need to keep in mind...