coldfront
Slurm plugin: Update default slurm account
If a user has associations under multiple Slurm accounts and you try to remove their access to the account that is also their default Slurm account, the removal fails. You have to modify the user to change the default account first, and only then remove their association with the old account. If this can't be automated, the failure needs to be logged and made viewable so the sysadmin can handle it before re-running the slurm sync tool. https://github.com/ubccr/coldfront/issues/273
A user has associations under more than one account:
sacctmgr show user ccrgst72 -s list format=user,defaultaccount,account,cluster,qos%45
User       Def Acct   Account    Cluster    QOS
---------- ---------- ---------- ---------- ---------------------------------------------
ccrgst72   ccrgsttest ccrgst72   alpha      debug,general-compute,normal,scavenger,viz
ccrgst72   ccrgsttest ccrgsttest alpha      debug,general-compute,normal,scavenger,viz
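A listing like the one above could be checked programmatically before attempting a removal. The following is a minimal sketch, not ColdFront's actual implementation: it assumes the same query is run with sacctmgr's `--parsable2 --noheader` options and `format=user,defaultaccount,account,cluster`, which yields pipe-delimited rows. The names `Association`, `parse_associations`, `default_is_affected`, and `other_accounts` are hypothetical.

```python
from typing import List, NamedTuple


class Association(NamedTuple):
    user: str
    default_account: str
    account: str
    cluster: str


def parse_associations(parsable_output: str) -> List[Association]:
    """Parse rows of the form user|defaultaccount|account|cluster."""
    rows = []
    for line in parsable_output.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 4 or not fields[0]:
            continue  # skip blank or malformed lines
        rows.append(Association(*fields[:4]))
    return rows


def default_is_affected(rows, user, cluster, removing) -> bool:
    """True if `removing` is the user's default account on `cluster`."""
    return any(
        r.user == user and r.cluster == cluster and r.default_account == removing
        for r in rows
    )


def other_accounts(rows, user, cluster, removing) -> List[str]:
    """Accounts on `cluster` the user would still belong to after the removal."""
    return sorted(
        {
            r.account
            for r in rows
            if r.user == user and r.cluster == cluster and r.account != removing
        }
    )
```

If `default_is_affected` returns true and `other_accounts` is non-empty, one of those accounts is a candidate for the new default; if it is empty, the case probably needs manual attention.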
When that user is removed from the allocation whose account is their default Slurm account, the sync fails with an error:
COLDFRONT_ENV=.env coldfront slurm_check -c alpha -a ccrgsttest -s -x
Syncing Slurm with ColdFront
username account cluster slurm_action slurm_specs
Slurm command failed: /util/software/ubuntu/slurm/current/bin/sacctmgr -Q -i delete user where name=ccrgst72 cluster=alpha account=ccrgsttest
Failed removing Slurm association user ccrgst72 account ccrgsttest cluster alpha: return_value=1 stdout=b' Deleting user associations...\n C = alpha A = ccrgsttest U = ccrgst72 \n' stderr=b' User ccrgst72 on cluster alpha no longer has a default account.\n You must change the default account of these users or remove the users completely from the affected clusters to allow these changes.\n Changes Discarded\n'
ccrgst72 ccrgsttest alpha Remove
Slurm command failed: /util/software/ubuntu/slurm/current/bin/sacctmgr -Q -i delete account where name=ccrgsttest cluster=alpha
Failed removing Slurm account ccrgsttest cluster alpha: return_value=1 stdout=b'' stderr=b" Users listed below have these as their Default Accounts.\nC = alpha A = ccrgsttest U = ccrgst72 \n Please either remove the accounts listed above from list and resubmit,\n or change these users' default accounts to remove the account(s).\n Changes Discarded\n"
ccrgsttest alpha Remove
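To log these cases separately for the sysadmin, the failure could be classified from the captured stderr. This is a rough sketch under the assumption that the message wording matches the log above; other Slurm versions may phrase it differently. The names `DEFAULT_ACCOUNT_MARKERS` and `is_default_account_error` are hypothetical.

```python
# Phrases copied from the sacctmgr stderr shown in the log above.
# Assumption: other Slurm versions may word these messages differently.
DEFAULT_ACCOUNT_MARKERS = (
    b"no longer has a default account",
    b"as their Default Accounts",
)


def is_default_account_error(stderr: bytes) -> bool:
    """Return True if a failed sacctmgr delete was blocked by a default account."""
    return any(marker in stderr for marker in DEFAULT_ACCOUNT_MARKERS)
```

A caller could use this to decide between a generic failure message and a specific "default account must be changed first" entry in the log.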
Currently, I manually modify the default account with:
sacctmgr modify user <username> set DefaultAccount=<account>
Then re-run the above slurm_check command.
This gets more complicated if the user has different default accounts on different clusters. You then have to specify the cluster:
sacctmgr modify user <username> set DefaultAccount=<account> where cluster=<cluster>
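The manual workaround could be scripted by building the argument list for either form of the command. This is a sketch only, not the change made in the fix: `build_set_default_cmd` is a hypothetical helper, and the `-Q -i` flags are borrowed from the failing commands in the log above so that sacctmgr does not prompt for confirmation.

```python
from typing import List, Optional


def build_set_default_cmd(
    username: str,
    account: str,
    cluster: Optional[str] = None,
    sacctmgr: str = "sacctmgr",  # assumption: binary is on PATH
) -> List[str]:
    """Build the sacctmgr argv that sets a user's default account."""
    cmd = [
        sacctmgr, "-Q", "-i",
        "modify", "user", username,
        "set", f"DefaultAccount={account}",
    ]
    if cluster:
        # Restrict the change to a single cluster.
        cmd += ["where", f"cluster={cluster}"]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, after which the slurm_check command can be re-run.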
Fixed in #597