Slurm_tools
total gpu usage with slurmacct
Hi,
I was wondering if there is a flag to get the total CPU and GPU usage with the slurmacct tool? The goal is to get the total CPU and GPU hours per month per partition.
Thank you
Hi, I'm sorry, but I don't know of a good way to get GPU accounting information from Slurm :-( Best regards, Ole
What about the following command (at least for GPUs):
sreport -tminper cluster utilization --tres="gres/gpu" start=2023-03-01T00:00:00
Output shows something like:
--------------------------------------------------------------------------------
Cluster    TRES Name  Allocated         Down            PLND Down  Idle             Planned   Reported
---------  ---------  ----------------  --------------  ---------  ---------------  --------  -----------------
myCluster  gres/gpu   14591077(57.06%)  2282656(8.93%)  0(0.00%)   8699467(34.02%)  0(0.00%)  25573200(100.00%)
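Since -tminper reports TRES minutes with a percentage in parentheses, the cells need a small conversion to give GPU hours. A minimal Python sketch, using the numbers from the output above; the helper names parse_cell and minutes_to_hours are made up for illustration and are not part of sreport or slurmacct:

```python
import re


def parse_cell(cell):
    """Split an sreport 'minper' cell such as '14591077(57.06%)' into (minutes, percent)."""
    match = re.fullmatch(r"(\d+)\(([\d.]+)%\)", cell.strip())
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return int(match.group(1)), float(match.group(2))


def minutes_to_hours(minutes):
    """Convert TRES minutes to TRES hours."""
    return minutes / 60.0


# Values copied from the 'Allocated' and 'Reported' columns above.
allocated_min, allocated_pct = parse_cell("14591077(57.06%)")
reported_min, _ = parse_cell("25573200(100.00%)")

print(round(minutes_to_hours(allocated_min), 1))        # 243184.6 GPU hours allocated
print(round(100 * allocated_min / reported_min, 2))     # 57.06, matches the reported percentage
```

So the allocated figure corresponds to roughly 243,000 GPU hours for the whole cluster over the reporting period.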
Combining CPU and GPU usage in one report may be possible but I am not sure if the numbers will be 'mixed up' too much.
The issue with the above report is that I cannot separate the usage by partition or by node. I have written my own reporting tool to calculate GPU hours per node and per partition.
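That tool is not shown in this thread, so the following is only a rough sketch of how a per-partition breakdown could be done, not the poster's implementation. It assumes job records were exported with something like `sacct -a -X -n -P -S 2023-03-01 -E 2023-04-01 -o Partition,ElapsedRaw,AllocTRES`, which gives pipe-separated lines, and that the site tracks gres/gpu in AllocTRES. The function name gpu_hours_by_partition and the sample records are invented for illustration:

```python
import re
from collections import defaultdict

# Matches the generic 'gres/gpu=N' entry only. Typed entries such as
# 'gres/gpu:a100=N' are skipped to avoid double counting when both appear
# (whether both appear depends on the site's accounting configuration).
GPU_RE = re.compile(r"(?:^|,)gres/gpu=(\d+)")


def gpu_hours_by_partition(sacct_lines):
    """Sum GPU hours per partition from 'Partition|ElapsedRaw|AllocTRES' lines."""
    totals = defaultdict(float)
    for line in sacct_lines:
        line = line.strip()
        if not line:
            continue
        partition, elapsed_raw, alloc_tres = line.split("|", 2)
        match = GPU_RE.search(alloc_tres)
        if match is None:
            continue  # job allocated no GPUs
        gpus = int(match.group(1))
        # ElapsedRaw is the job's elapsed time in seconds.
        totals[partition] += gpus * int(elapsed_raw) / 3600.0
    return dict(totals)


# Invented sample records, not real accounting data.
sample = [
    "gpu|7200|billing=8,cpu=8,gres/gpu=2,mem=64G,node=1",
    "gpu|1800|billing=4,cpu=4,gres/gpu=1,mem=32G,node=1",
    "cpu|3600|billing=16,cpu=16,mem=32G,node=1",
    "a100|3600|billing=4,cpu=4,gres/gpu=4,gres/gpu:a100=4,mem=32G,node=1",
]
print(gpu_hours_by_partition(sample))  # {'gpu': 4.5, 'a100': 4.0}
```

Note that this counts each job's full elapsed time, so jobs that cross the start or end of the month would need extra handling (sacct's --truncate option may help here), and it reports allocated GPU time only, not down or idle time as sreport does.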
OleHolmNielsen, You've written some great utilities, and provided some excellent info to the slurm-users mailing list. Thanks! The one thing sreport does that slurmacct doesn't is allow itself to be run as a non-root user, as long as the user has the admin role in the Slurm database. Do you have any suggestions for running slurmacct as a non-root user?
Hi, thanks for your nice comments! The slurmacct script actually uses the Slurm commands sreport and sacct to generate reports. How did you find that non-root users aren't allowed to use slurmacct? Please first make sure that the sreport and sacct commands are permitted for your non-root user.
Thanks for your reply! I saw that in the script, and was puzzled, because these guys could run sreport (and friends) without issues. The helpful message from the OS was "permission denied".
I never figured out why, but I got it working by throwing the users into the slurm group and granting rights to execute it in sudoers.
Got there the long way 'round, but at least I didn't (as one user suggested) resort to setuid! Thanks again for sharing your hard work and wisdom.