Intel Granite Rapids (Xeon 6980P 128-Core / 256-Thread) - HYDU_create_process (Too many open files)
Attempting to run Top500 Playbook on a 2P Granite Rapids server. Single server test.
- 2x Intel Xeon 6980P 128-Core / 256-Thread
Benchmark errors out with "HYDU_create_process (lib/utils/launch.c:24): pipe error (Too many open files)"
I have attempted to work around the issue by raising the open-file limit on the host (ulimit -n 4096). Sometimes that still results in this same error... other times, the script hangs at "TASK [Run the benchmark]" and never progresses.
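For reference, this is roughly what I was running to raise the limit before retrying (just my workaround attempt; note that ulimit -n only changes the open-file limit for the current shell, and ulimit -u is the separate process limit):
ulimit -n          # current soft limit on open file descriptors (what the pipe error is hitting)
ulimit -u          # separate max-user-processes limit; not the one this error refers to
ulimit -n 4096     # what I set; applies only to the current shell session
Here is the full failing task output: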
TASK [Run the benchmark.] *****************************************************************
fatal: [127.0.0.1]: FAILED! => changed=true
cmd:
- mpirun
- -f
- cluster-hosts
- ./xhpl
delta: '0:00:00.083887'
end: '2024-11-01 14:19:52.600072'
msg: non-zero return code
rc: 255
start: '2024-11-01 14:19:52.516185'
stderr: |-
[proxy:0@craft-6900P] HYDU_create_process (lib/utils/launch.c:24): pipe error (Too many open files)
[proxy:0@craft-6900P] launch_procs (proxy/pmip_cb.c:1003): create process returned error
[proxy:0@craft-6900P] handle_launch_procs (proxy/pmip_cb.c:588): launch_procs returned error
[proxy:0@craft-6900P] HYD_pmcd_pmip_control_cmd_cb (proxy/pmip_cb.c:498): launch_procs returned error
[proxy:0@craft-6900P] HYDT_dmxu_poll_wait_for_event (lib/tools/demux/demux_poll.c:76): callback returned error status
[proxy:0@craft-6900P] main (proxy/pmip.c:122): demux engine error waiting for event
[mpiexec@craft-6900P] control_cb (mpiexec/pmiserv_cb.c:280): assert (!closed) failed
[mpiexec@craft-6900P] HYDT_dmxu_poll_wait_for_event (lib/tools/demux/demux_poll.c:76): callback returned error status
[mpiexec@craft-6900P] HYD_pmci_wait_for_completion (mpiexec/pmiserv_pmci.c:173): error waiting for event
[mpiexec@craft-6900P] main (mpiexec/mpiexec.c:260): process manager error waiting for completion
stderr_lines: <omitted>
stdout: ''
stdout_lines: <omitted>
NO MORE HOSTS LEFT ************************************************************************
PLAY RECAP ********************************************************************************
127.0.0.1 : ok=22 changed=4 unreachable=0 failed=1 skipped=7 rescued=0 ignored=0
This sounds a lot like an MPI hostname issue — for your hosts.ini file, do you have it set like the example?
# For single node benchmarking (default), use this:
[cluster]
127.0.0.1 ansible_connection=local
And when you run the benchmark, did you just run ansible-playbook main.yml --tags "setup,benchmark"? If you don't have the tags, it might've tried to set up clustering... which could make things act a little funny!
Worst case, though, you can nuke the build folder (rm -rf /opt/top500) and try again.
Usually if it hits "Run the benchmark" and nothing happens, it means MPI is trying to fire off the processes, but for some reason is not able to.
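If it helps, the clean-slate sequence I'd run (same commands as above, from inside your top500-benchmark checkout) is:
cd ~/top500-benchmark        # or wherever you cloned the playbook
rm -rf /opt/top500           # wipe the previous build
ansible-playbook main.yml --tags "setup,benchmark"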
Oh one more thing, just for context — is it running on Ubuntu or some other distro?
Yes, I am running ansible-playbook main.yml --tags "setup,benchmark"
Hosts file is set up for local (127.0.0.1) as well.
I've tried the Ps x Qs configuration at the default (1/4), as well as 1/256 and 2/128, with no effect.
I nuked the /opt/top500 folder and ran again. Same result.
@CraftComputing - Can you try running the benchmark manually?
cd /opt/top500/tmp/hpl-2.3/bin/top500
mpirun -f cluster-hosts ./xhpl
I wonder if there's some output Ansible is eating that may be helpful for debugging this. Also, for completeness, can you post the contents of the cluster-hosts file in that directory, as well as HPL.dat? (The latter is just for reference; for your system the tuning might be better with a more even Ps/Qs, e.g. 16 x 16 for 256 ranks.)
cat /opt/top500/tmp/hpl-2.3/bin/top500/cluster-hosts
cat /opt/top500/tmp/hpl-2.3/bin/top500/HPL.dat
hosts.ini:
# For single node benchmarking (default), use this:
[cluster]
127.0.0.1 ansible_connection=local
mpirun -f cluster-hosts ./xhpl:
[mpiexec@craft-6900P] HYDU_parse_hostfile (lib/utils/args.c:324): unable to open host file: cluster-hosts
[mpiexec@craft-6900P] mfile_fn (mpiexec/options.c:315): error parsing hostfile
[mpiexec@craft-6900P] match_arg (lib/utils/args.c:159): match handler returned error
[mpiexec@craft-6900P] HYDU_parse_array (lib/utils/args.c:181): argument matching returned error
[mpiexec@craft-6900P] parse_args (mpiexec/get_parameters.c:313): error parsing input array
[mpiexec@craft-6900P] HYD_uii_mpx_get_parameters (mpiexec/get_parameters.c:48): unable to parse user arguments
[mpiexec@craft-6900P] main (mpiexec/mpiexec.c:54): error parsing parameters
cat /opt/top500/tmp/hpl-2.3/bin/top500/cluster-hosts
10.0.0.179:512
cat /opt/top500/tmp/hpl-2.3/bin/top500/HPL.dat:
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out output file name (if any)
6 device out (6=stdout,7=stderr,file)
1 # of problems sizes (N)
350963 Ns
1 # of NBs
256 NBs
0 PMAP process mapping (0=Row-,1=Column-major)
1 # of process grids (P x Q)
1 Ps
256 Qs
16.0 threshold
1 # of panel fact
2 PFACTs (0=left, 1=Crout, 2=Right)
1 # of recursive stopping criterium
4 NBMINs (>= 1)
1 # of panels in recursion
2 NDIVs
1 # of recursive panel fact.
1 RFACTs (0=left, 1=Crout, 2=Right)
1 # of broadcast
1 BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1 # of lookahead depth
1 DEPTHs (>=0)
2 SWAP (0=bin-exch,1=long,2=mix)
64 swapping threshold
0 L1 in (0=transposed,1=no-transposed) form
0 U in (0=transposed,1=no-transposed) form
1 Equilibration (0=no,1=yes)
8 memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0 Number of additional problem sizes for PTRANS
1200 10000 30000 values of N
0 number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64 values of NB
@CraftComputing - Thanks! I'm going to boot my Ampere machine and double-check a couple things. I think it may be what's in cluster-hosts; the fun thing is this could be related to DNS too ;)
Can you check the contents of your /etc/hosts file on that machine too? I'm going to compare cluster-hosts and my /etc/hosts on a known working machine. Something may have gotten lost in translation.
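(Side note: the 'unable to open host file' error from your manual run is most likely just the working directory; mpirun looks for cluster-hosts relative to wherever you invoke it, so cd into the build directory first:)
cd /opt/top500/tmp/hpl-2.3/bin/top500
mpirun -f cluster-hosts ./xhpl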
For comparison, my files:
# Inside /opt/top500/tmp/hpl-2.3/bin/top500/cluster-hosts
10.0.2.21:192
# Inside /etc/hosts
127.0.0.1 localhost
127.0.1.1 ubuntu
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
And I can confirm I can ping my system's mDNS name OR local IP and get a result:
ubuntu@ubuntu:~$ ping 10.0.2.21
PING 10.0.2.21 (10.0.2.21) 56(84) bytes of data.
64 bytes from 10.0.2.21: icmp_seq=1 ttl=64 time=0.042 ms
64 bytes from 10.0.2.21: icmp_seq=2 ttl=64 time=0.009 ms
^C
--- 10.0.2.21 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.009/0.025/0.042/0.016 ms
ubuntu@ubuntu:~$ ping ubuntu
PING ubuntu (127.0.1.1) 56(84) bytes of data.
64 bytes from ubuntu (127.0.1.1): icmp_seq=1 ttl=64 time=0.047 ms
64 bytes from ubuntu (127.0.1.1): icmp_seq=2 ttl=64 time=0.005 ms
^C
--- ubuntu ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1037ms
rtt min/avg/max/mdev = 0.005/0.026/0.047/0.021 ms
Can you confirm the same on your system? I wonder if you might have a network setup that is causing mpich to be angry :(
Hosts file... 10.0.0.179 is my local IP address
127.0.0.1 localhost
127.0.1.1 craft-6900P
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
# BEGIN Ansible MPI host 127.0.0.1
10.0.0.179 127.0.0.1 127.0.0.1
# END Ansible MPI host 127.0.0.1
I can ping both the local IP 10.0.0.179 and mDNS record craft-6900p from the localhost. The network is a flat LAN, 10.0.0.0/24.
Another idea, since you have Hyperthreading... can you modify /opt/top500/tmp/hpl-2.3/bin/top500/cluster-hosts, change the 512 to 256, and try running the manual command again?
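(If it's easier, one way to do that in place, assuming cluster-hosts still contains the single 10.0.0.179:512 line, is:)
sed -i 's/:512$/:256/' /opt/top500/tmp/hpl-2.3/bin/top500/cluster-hosts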
Changed 512 to 256, and still hanging at the same spot.
TASK [Run the benchmark.] ******************************************************************************************************************************************************************************************
task path: /home/craft/top500-benchmark/main.yml:214
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: craft
<127.0.0.1> EXEC /bin/sh -c 'echo ~craft && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/craft/.ansible/tmp `"&& mkdir "` echo /home/craft/.ansible/tmp/ansible-tmp-1730516294.043043-9707-136336910001751 `" && echo ansible-tmp-1730516294.043043-9707-136336910001751="` echo /home/craft/.ansible/tmp/ansible-tmp-1730516294.043043-9707-136336910001751 `" ) && sleep 0'
Using module file /usr/lib/python3/dist-packages/ansible/modules/command.py
<127.0.0.1> PUT /home/craft/.ansible/tmp/ansible-local-8737msev_9aq/tmpi3woq2tr TO /home/craft/.ansible/tmp/ansible-tmp-1730516294.043043-9707-136336910001751/AnsiballZ_command.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /home/craft/.ansible/tmp/ansible-tmp-1730516294.043043-9707-136336910001751/ /home/craft/.ansible/tmp/ansible-tmp-1730516294.043043-9707-136336910001751/AnsiballZ_command.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python3 /home/craft/.ansible/tmp/ansible-tmp-1730516294.043043-9707-136336910001751/AnsiballZ_command.py && sleep 0'
It sounds like my script needs a little updating; specifically, hostvars[host].ansible_processor_vcpus counts threads, not cores, which results in the errant 512 in your cluster-hosts file.
https://github.com/geerlingguy/top500-benchmark/blob/53ae3f35f2cb1c7f00f9b948ceae082fadf01560/templates/mpi-node-config.j2#L2
I'll look at a better option for that. (Or maybe have it switch depending on architecture?)
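For reference, here's the distinction I mean. A rough sketch of what I'd expect lscpu to report on a 2x 6980P box (numbers assumed from the 128-core / 256-thread-per-socket spec, not taken from your machine):
lscpu | grep -E '^(CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\))'
# CPU(s):               512   <- what ansible_processor_vcpus picks up (hardware threads)
# Thread(s) per core:   2
# Core(s) per socket:   128
# Socket(s):            2     <- 2 x 128 = 256 physical cores, the count that actually worked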
Separately, since you mentioned elsewhere that switching the count from 512 to 256 got HPL running... once you get a full run in, we might be able to tweak the blis build for your particular architecture, depending on whether there's a better config (see all the blis configs).
On the AmpereOne, tweaking that made almost a 40% improvement, but its architecture is vastly different from the generic arm64 config the automatic configuration picks out.
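To be concrete about what that would look like, a minimal sketch (assuming a manual rebuild from inside the blis source tree the playbook checks out; skx here is only an example of an existing named config, not necessarily the right one for Granite Rapids):
ls config/                    # the named configs blis ships with (generic, haswell, skx, zen3, ...)
./configure -t openmp auto    # what the automatic build does: let blis guess the config
./configure -t openmp skx     # force a specific named config instead, then rebuild with make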
Opened a follow-up issue: https://github.com/geerlingguy/top500-benchmark/issues/46
For now, we can just act like Ansible doesn't exist anymore and run the command manually :)
This issue has been marked 'stale' due to lack of recent activity. If there is no further activity, the issue will be closed in another 30 days. Thank you for your contribution!
Please read this blog post to see the reasons why I mark issues as stale.
This issue has been closed due to inactivity. If you feel this is in error, please reopen the issue or file a new issue with the relevant details.