crew.cluster
Monitor classes for SLURM, PBS, and LSF
Prework
- [x] Read and agree to the Contributor Code of Conduct and contributing guidelines.
- [x] If there is already a relevant issue, whether open or closed, comment on the existing thread instead of posting a new issue.
- [x] New features take time and effort to create, and they take even more effort to maintain. So if the purpose of the feature is to resolve a struggle you are encountering personally, please consider first posting a GitHub discussion.
- [x] Format your code according to the tidyverse style guide.
Proposal
crew.cluster 0.2.0 supports a new "monitor" class to help list and terminate SGE jobs from R instead of the command line. https://wlandau.github.io/crew.cluster/index.html#monitoring shows an example using `crew_monitor_sge()`:
```r
monitor <- crew_monitor_sge()
job_list <- monitor$jobs()
job_list
#> # A tibble: 2 × 9
#>   job_number prio    name    owner state start_time queue_name jclass_name slots
#>   <chr>      <chr>   <chr>   <chr> <chr> <chr>      <chr>      <lgl>       <chr>
#> 1 131853812  0.05000 crew-m… USER… r     2024-01-0… all.norma… NA          1
#> 2 131853813  0.05000 crew-m… USER… r     2024-01-0… all.norma… NA          1
monitor$terminate(jobs = job_list$job_number)
#> USER has registered the job 131853812 for deletion
#> USER has registered the job 131853813 for deletion
monitor$jobs()
#> data frame with 0 columns and 0 rows
```
Currently only SGE is supported. I would like to add other monitor classes for other clusters, but I do not have access to SLURM, PBS, or LSF. cc'ing @nviets, @brendanf, and/or @mglev1n, in case you are interested.
Hi @wlandau - just to confirm my understanding, you're proposing we add, for instance, `crew_monitor_slurm()` and all related bits following `crew_monitor_sge.R`?
Yes, exactly! On SGE, the hardest part for me was parsing job status information. I had to dig into the XML because the non-XML output from `qstat` is not machine-readable. Other than that, we would just use SLURM's commands instead of `qstat`/`qdel`. The `R6` boilerplate should be a simple copy/paste.
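For reference, the core of the XML parsing looks roughly like this. This is a minimal sketch, assuming `qstat -u <user> -xml` returns `<job_list>` nodes with `<JB_job_number>`, `<JB_name>`, and `<state>` children; the real code in `crew_monitor_sge.R` handles more fields and edge cases:

```r
# Minimal sketch of the SGE XML parsing (fields vary by installation):
user <- ps::ps_username()
text <- system2("qstat", args = c("-u", shQuote(user), "-xml"), stdout = TRUE)
xml <- xml2::read_xml(paste(text, collapse = "\n"))
jobs <- xml2::xml_find_all(xml, "//job_list")
tibble::tibble(
  job_number = xml2::xml_text(xml2::xml_find_first(jobs, "JB_job_number")),
  name = xml2::xml_text(xml2::xml_find_first(jobs, "JB_name")),
  state = xml2::xml_text(xml2::xml_find_first(jobs, "state"))
)
```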
> The `R6` boilerplate should be a simple copy/paste.
Actually, first I would like to simplify this part by creating a common abstract parent class for all the monitors to inherit from...
I'll give some thought to SLURM. There are the usual SLURM commands (`squeue`, `scancel`, etc.) whose output we could parse, but there is also a database (optional and typically used in larger installations) that could be queried. Maybe the former is better, at least in the short term, since not everyone will have the database.
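The terminate side looks simple either way; a minimal sketch (the wrapper name and signature here are hypothetical):

```r
# Sketch: terminate SLURM jobs with scancel ('jobs' is a character
# vector of job IDs; the wrapper name is hypothetical):
terminate_slurm <- function(jobs) {
  system2("scancel", args = shQuote(jobs), stdout = FALSE, stderr = FALSE)
  invisible(NULL)
}
```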
Thanks for looking into this! In the end I would prefer something that all/most SLURM users would be able to use.
By the way, as of 8cf036bf95be4dd0a99cab34eb43fda7fa6fda52 I created a parent monitor class that all cluster-specific monitors inherit from: https://github.com/wlandau/crew.cluster/blob/main/R/crew_monitor_cluster.R. This helps reduce duplicated code/docs. The SGE monitor is much shorter now and easy to copy: https://github.com/wlandau/crew.cluster/blob/main/R/crew_monitor_sge.R. Tests are at https://github.com/wlandau/crew.cluster/blob/main/tests/testthat/test-crew_monitor_sge.R and https://github.com/wlandau/crew.cluster/blob/main/tests/sge/monitor.R.
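Roughly, the inheritance pattern looks like this. This is a simplified sketch, not the actual class definitions; see the linked files for the real fields and methods:

```r
# Abstract parent: holds the fields and setup common to all monitors.
crew_class_monitor_cluster <- R6::R6Class(
  classname = "crew_class_monitor_cluster",
  private = list(.verbose = NULL),
  public = list(
    initialize = function(verbose = FALSE) {
      private$.verbose <- verbose
    }
  )
)

# Scheduler-specific child: only jobs() and terminate() differ.
crew_class_monitor_slurm <- R6::R6Class(
  classname = "crew_class_monitor_slurm",
  inherit = crew_class_monitor_cluster,
  public = list(
    jobs = function() {
      # parse squeue output here
    },
    terminate = function(jobs) {
      # call scancel here
    }
  )
)
```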
To make sure I understand, the monitor is only for interactive use? So the data frame output by `jobs()` does not need to have any particular column names?
There are two options for `squeue` that I am aware of. The first is to parse the standard output, which is a fixed-width table (optionally, the columns and widths can be specified with the `-o` or `-O` options if we don't trust the defaults to be the same for all users):
```r
# this is the default format given in `man squeue`, but specify it
# in case some user's configuration is different
default_format <- "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R"
text <- system2(
  "squeue",
  args = shQuote(c("-u", user, "-o", default_format)),
  stdout = TRUE,
  stderr = if_any(private$.verbose, "", FALSE),
  wait = TRUE
)
con <- textConnection(text)
out <- read.fwf(
  con,
  widths = c(18, -1, 9, -1, 8, -1, 8, -1, 2, -1, 10, -1, 6, -1, 100),
  skip = 1,
  col.names = c(
    "JOBID", "PARTITION", "NAME", "USER", "ST", "TIME", "NODES",
    "NODELIST_REASON"
  ),
  strip.white = TRUE
)
tibble::as_tibble(out)
## A tibble: 7 × 8
#   JOBID    PARTITION NAME     USER     ST    TIME  NODES NODELIST_REASON
#   <int>    <chr>     <chr>    <chr>    <chr> <chr> <int> <chr>
# 1 20504876 small     crew-Opt brfurnea R     52:46     1 r18c36
# 2 20504877 small     crew-Opt brfurnea R     52:46     1 r18c23
# 3 20504863 small     crew-Opt brfurnea R     52:50     1 r18c41
# 4 20504851 small     crew-Opt brfurnea R     53:06     1 r18c33
# 5 20504854 small     crew-Opt brfurnea R     53:06     1 r18c35
# 6 20504857 small     crew-Opt brfurnea R     53:06     1 r18c40
# 7 20504848 small     OptimOTU brfurnea R     53:35     1 r18c43
```
The second option is `squeue --yaml`, which gives a full dump of the entire queue. Arguments like `-u` do nothing to filter the output, so the monitor would have to do this itself. Especially on a big cluster, this is a lot of data:
text <- system2("squeue", args = shQuote("--yaml"), stdout = TRUE, stderr = FALSE, wait = TRUE)
length(text)
# [1] 269314
This is both because there are a lot of jobs and because the dump includes all possible fields, more than 100 per job.
My feeling is that option 1 is the way to go, despite the fact that fixed-width output may truncate some values (for instance, NAME above).
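If we go with option 1, one mitigation could be to request wider fields so truncation is less likely; a sketch (the 50-character name width is an arbitrary choice, not a `squeue` default, and `user` comes from `ps::ps_username()` as before):

```r
# Sketch: request a wider name column so long job names survive.
wide_format <- "%.18i %.9P %.50j %.8u %.2t %.10M %.6D %R"
text <- system2(
  "squeue",
  args = shQuote(c("-u", user, "-o", wide_format)),
  stdout = TRUE
)
# The read.fwf() widths would then need to match: 18, 9, 50, 8, 2, 10, 6, ...
```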
That's a tough choice, and it's a shame that the more structured YAML output is so large. How large exactly, in terms of output size and execution time? I am concerned that subtle variations from cluster to cluster, and odd things like spaces in job names, could interfere with parsing the standard output.
On my cluster, `squeue --yaml` returned 11 MB in 0.7 s. Parsing the result with `yaml::read_yaml()` took about 1.4 s. At the time of my test there were 2166 jobs in the queue. If it's only going to be used interactively, that's probably acceptable, but I certainly would not want to call it often in a script.
Yeah, monitor objects are just for interactive use. I think those performance metrics are not terrible as long as the documentation gives the user a heads up.
The YAML queue dump includes 111 fields for each job, some of which are themselves structured; e.g., one field is `job_resources`, which looks like this:
```r
job_resources
job_resources$nodes
[1] "r15c35"
job_resources$allocated_cores
[1] 6
job_resources$allocated_hosts
[1] 1
job_resources$allocated_nodes
job_resources$allocated_nodes[[1]]
job_resources$allocated_nodes[[1]]$sockets
job_resources$allocated_nodes[[1]]$sockets$`0`
job_resources$allocated_nodes[[1]]$sockets$`0`$cores
job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`6`
[1] "allocated"
job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`7`
[1] "allocated"
job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`8`
[1] "allocated"
job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`9`
[1] "allocated"
job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`10`
[1] "allocated"
job_resources$allocated_nodes[[1]]$sockets$`0`$cores$`11`
[1] "allocated"
job_resources$allocated_nodes[[1]]$nodename
[1] "r15c35"
job_resources$allocated_nodes[[1]]$cpus_used
[1] 6
job_resources$allocated_nodes[[1]]$memory_used
[1] 12288
job_resources$allocated_nodes[[1]]$memory_allocated
[1] 12288
```
This code approximately recreates the default `squeue` output. I substituted start time for elapsed time because the YAML does not actually include elapsed time, and I want to avoid computing it myself in situations where, e.g., I am using UTC while SLURM is configured to use local time, or vice versa.
```r
library(purrr) # for map() and %||%
user <- ps::ps_username()
monitor_cols <- c(
  "job_id", "partition", "name", "user_name", "job_state",
  "start_time", "node_count", "state_reason"
)
text <- system2(
  "squeue",
  args = "--yaml",
  stdout = TRUE,
  # stderr = if_any(private$.verbose, "", FALSE),
  wait = TRUE
)
yaml <- yaml::read_yaml(text = text)
out <- map(
  yaml$jobs,
  ~ tibble::new_tibble(
    c(
      map(.x[monitor_cols], ~ unlist(.x) %||% NA),
      list(nodes = paste(unlist(.x$job_resources$nodes), collapse = ",") %||% NA)
    )
  )
)
out <- do.call(vctrs::vec_rbind, out)
out <- out[out$user_name == user, ]
out$start_time <- as.POSIXct(out$start_time, origin = "1970-01-01")
out
# # A tibble: 14 × 9
#      job_id partition name    user_name job_state start_time          node_count
#       <int> <chr>     <chr>   <chr>     <chr>     <dttm>                   <int>
#  1 20386512 longrun   R_Moth… guilbaul  RUNNING   2024-02-09 09:05:33          1
#  2 20386513 longrun   R_Moth… guilbaul  RUNNING   2024-02-09 09:05:33          1
#  3 20386514 longrun   R_Moth… guilbaul  RUNNING   2024-02-09 09:05:33          1
#  4 20386515 longrun   R_Moth… guilbaul  RUNNING   2024-02-09 09:05:33          1
#  5 20386516 longrun   R_Moth… guilbaul  RUNNING   2024-02-09 09:05:33          1
#  6 20386517 longrun   R_Moth… guilbaul  RUNNING   2024-02-09 09:05:33          1
#  7 20386509 longrun   R_Moth… guilbaul  RUNNING   2024-02-09 09:05:33          1
#  8 20446032 longrun   R_Moth… guilbaul  RUNNING   2024-02-14 09:27:25          1
#  9 20446033 longrun   R_Moth… guilbaul  RUNNING   2024-02-14 09:27:25          1
# 10 20446034 longrun   R_Moth… guilbaul  RUNNING   2024-02-14 09:27:25          1
# 11 20446035 longrun   R_Moth… guilbaul  RUNNING   2024-02-14 09:27:25          1
# 12 20446036 longrun   R_Moth… guilbaul  RUNNING   2024-02-14 09:27:25          1
# 13 20446037 longrun   R_Moth… guilbaul  RUNNING   2024-02-14 09:27:25          1
# 14 20446004 longrun   R_Moth… guilbaul  RUNNING   2024-02-14 09:27:25          1
# # ℹ 2 more variables: state_reason <chr>, nodes <chr>
```
Nice! Got time for a PR?
Sorry, I was pulled away from this thread by work. The YAML option looks like a much better approach than parsing `squeue` output, but I think it requires an extra plugin and a minimum SLURM version. It would be worth adding a warning or something. See: "Why am I getting the following error: 'Unable to find plugin: serializer/json'?".
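A sketch of what such a guard might look like, assuming `squeue` exits with a nonzero status when the serializer plugin is missing (worth verifying on an affected cluster):

```r
# Capture stderr along with stdout and check the exit status.
text <- suppressWarnings(
  system2("squeue", args = "--yaml", stdout = TRUE, stderr = TRUE)
)
status <- attr(text, "status")
if (!is.null(status) && status != 0L) {
  stop(
    "squeue --yaml failed. Your SLURM installation may be too old or ",
    "missing the serializer/json plugin.",
    call. = FALSE
  )
}
```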
It looks like the LSF job output can similarly be parsed using either the fixed-width table or JSON (see example below). This would add a `jsonlite` dependency:
```r
user <- ps::ps_username()
text <- system2(
  "bjobs",
  args = c("-o 'user jobid job_name stat queue slots mem start_time run_time'", "-json"),
  stdout = TRUE,
  # stderr = if_any(private$.verbose, "", FALSE),
  wait = TRUE
)
json <- jsonlite::fromJSON(text)
out <- json$RECORDS
out
#      USER    JOBID JOB_NAME STAT               QUEUE SLOTS         MEM   START_TIME        RUN_TIME
# 1 mglevin 25900189     bash  RUN voltron_interactive     1    8 Mbytes Feb 29 09:12    313 second(s)
# 2 mglevin 25900201     bash  RUN voltron_interactive     1    2 Mbytes Feb 29 09:17     22 second(s)
# 3 mglevin 25665912  rstudio  RUN     voltron_rstudio     2 87.9 Gbytes Feb 26 15:36 236482 second(s)
```
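For terminating, `bkill` is LSF's analogue of `qdel`/`scancel`, so the same pattern as the earlier SLURM sketch would apply:

```r
# Sketch: bkill the jobs listed above (assumes out$JOBID holds the IDs).
system2("bkill", args = shQuote(out$JOBID), stdout = FALSE, stderr = FALSE)
```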
Awesome! `jsonlite` is super lightweight and reliable; I don't mind it as a dependency. Would you be willing to open a PR?
Just here to say hi. Still early days for me with {crew}, but I'm excited to learn. I have access to SLURM and PBS, and I'm reading along.