nextflow
nextflow copied to clipboard
pbspro: ignore resource directives if -l option is specified
Closes #2264
Alternative solution to #2971
This PR changes the pbspro executor to ignore cpus
and memory
directives if clusterOptions
contains -l
. That way, you can do something like:
clusterOptions = { "-l select=1:ncpus=${task.cpus}:mem=${task.memory.toMega()}mb:..." }
Currently this doesn't work because Nextflow generates two -l
lines which is invalid. With this PR, Nextflow will only generate the one -l
line from clusterOptions
.
Could be useful for the other grid executors as well.
Probably should apply this change to all of the grid executors at once so that clusterOptions is consistent. I will review the other executors to see which ones should be updated.
Upon review, none of the other grid executors seems to have this restriction of one resource list, so I think we should go ahead and merge this one.
pkg:maven/io.nextflow/[email protected]
1 Critical, 9 Severe, 0 Moderate, 0 Unknown vulnerabilities have been found across 1 dependencies
Components
pkg:maven/io.nextflow/[email protected]
CRITICAL Vulnerabilities (1)
Uncontrolled Resource Consumption
moment is a JavaScript date library for parsing, validating, manipulating, and formatting dates. Affected versions of moment were found to use an inefficient parsing algorithm. Specifically using string-to-date parsing in moment (more specifically rfc2822 parsing, which is tried by default) has quadratic (N^2) complexity on specific inputs. Users may notice a noticeable slowdown is observed with inputs above 10k characters. Users who pass user-provided strings without sanity length checks to moment constructor are vulnerable to (Re)DoS attacks. The problem is patched in 2.29.4, the patch can be applied to all affected versions with minimal tweaking. Users are advised to upgrade. Users unable to upgrade should consider limiting date lengths accepted from user input.
CVSS Score: 7.5
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
CWE: CWE-400
SEVERE Vulnerabilities (9)
sonatype-2021-0900
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
CVSS Score: 6.5
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N
CWE: CWE-79
CVE-2018-14040
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In Bootstrap before 4.1.2, XSS is possible in the collapse data-parent attribute.
CVSS Score: 6.1
CVSS Vector: CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
CVE-2018-14041
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In Bootstrap before 4.1.2, XSS is possible in the data-target property of scrollspy.
CVSS Score: 6.1
CVSS Vector: CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
CVE-2018-14042
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In Bootstrap before 4.1.2, XSS is possible in the data-container property of tooltip.
CVSS Score: 6.1
CVSS Vector: CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
CVE-2019-11358
Improperly Controlled Modification of Object Prototype Attributes ('Prototype Pollution')
jQuery before 3.4.0, as used in Drupal, Backdrop CMS, and other products, mishandles jQuery.extend(true, {}, ...) because of Object.prototype pollution. If an unsanitized source object contained an enumerable proto property, it could extend the native Object.prototype.
CVSS Score: 6.1
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-1321
CVE-2019-8331
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In Bootstrap before 3.4.1 and 4.3.x before 4.3.1, XSS is possible in the tooltip or popover data-template attribute.
CVSS Score: 6.1
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
CVE-2020-11023
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In jQuery versions greater than or equal to 1.0.3 and before 3.5.0, passing HTML containing <option> elements from untrusted sources - even after sanitizing it - to one of jQuery's DOM manipulation methods (i.e. .html(), .append(), and others) may execute untrusted code. This problem is patched in jQuery 3.5.0.
CVSS Score: 6.1
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
sonatype-2018-0607
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
CVSS Score: 6.1
CVSS Vector: CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
sonatype-2020-0187
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
CVSS Score: 6.1
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
pkg:maven/io.nextflow/[email protected]
1 Critical, 9 Severe, 0 Moderate, 0 Unknown vulnerabilities have been found across 1 dependencies
Components
pkg:maven/io.nextflow/[email protected]
CRITICAL Vulnerabilities (1)
Uncontrolled Resource Consumption
moment is a JavaScript date library for parsing, validating, manipulating, and formatting dates. Affected versions of moment were found to use an inefficient parsing algorithm. Specifically using string-to-date parsing in moment (more specifically rfc2822 parsing, which is tried by default) has quadratic (N^2) complexity on specific inputs. Users may notice a noticeable slowdown is observed with inputs above 10k characters. Users who pass user-provided strings without sanity length checks to moment constructor are vulnerable to (Re)DoS attacks. The problem is patched in 2.29.4, the patch can be applied to all affected versions with minimal tweaking. Users are advised to upgrade. Users unable to upgrade should consider limiting date lengths accepted from user input.
CVSS Score: 7.5
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
CWE: CWE-400
SEVERE Vulnerabilities (9)
sonatype-2021-0900
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
CVSS Score: 6.5
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N
CWE: CWE-79
CVE-2018-14040
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In Bootstrap before 4.1.2, XSS is possible in the collapse data-parent attribute.
CVSS Score: 6.1
CVSS Vector: CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
CVE-2018-14041
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In Bootstrap before 4.1.2, XSS is possible in the data-target property of scrollspy.
CVSS Score: 6.1
CVSS Vector: CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
CVE-2018-14042
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In Bootstrap before 4.1.2, XSS is possible in the data-container property of tooltip.
CVSS Score: 6.1
CVSS Vector: CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
CVE-2019-11358
Improperly Controlled Modification of Object Prototype Attributes ('Prototype Pollution')
jQuery before 3.4.0, as used in Drupal, Backdrop CMS, and other products, mishandles jQuery.extend(true, {}, ...) because of Object.prototype pollution. If an unsanitized source object contained an enumerable proto property, it could extend the native Object.prototype.
CVSS Score: 6.1
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-1321
CVE-2019-8331
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In Bootstrap before 3.4.1 and 4.3.x before 4.3.1, XSS is possible in the tooltip or popover data-template attribute.
CVSS Score: 6.1
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
CVE-2020-11023
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
In jQuery versions greater than or equal to 1.0.3 and before 3.5.0, passing HTML containing <option> elements from untrusted sources - even after sanitizing it - to one of jQuery's DOM manipulation methods (i.e. .html(), .append(), and others) may execute untrusted code. This problem is patched in jQuery 3.5.0.
CVSS Score: 6.1
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
sonatype-2018-0607
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
CVSS Score: 6.1
CVSS Vector: CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
sonatype-2020-0187
Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
CVSS Score: 6.1
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
CWE: CWE-79
One thing we haven't tested here is if cpus
and/or memory
are also closures. In that case, will they be evaluated before clusterOptions
? Or will they be evaluated on-the-fly when the clusterOptions invokes task.cpus
?
@bentsherman How can I test this on a production environment? If you give me some hints about how to run this exact revision I can run some tests on our current environment and return the outputs here. I can try some closures.
- clone the nextflow repo
- run
make compile
- use
./launch.sh
as a drop-in replacement for thenextflow
binary, e.g../launch.sh run test.nf
- run a test pipeline with dynamic
cpus
andclusterOptions
something like:
cpus = { 2 * task.attempt }
clusterOptions = { "-l nodes=1:${params.architecture}:ppn=${task.cpus}" }
Commands used:
git clone https://github.com/nextflow-io/nextflow
git fetch origin
git checkout -b ben/pbspro-cluster-options
cd nextflow
make compile
cd ..
./nextflow/launch.sh main.nf -c nextflow.config
qstat output:
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
----------------------- ----------- -------- ---------------- ------ ----- ------ --------- --------- - ---------
33664.hpcnod11 xxxxxxxxxx sabmi nf-FOO 48529 1 2 4gb -- R 00:00:00 hpcnod03/0-1
.command.run header:
#PBS -l nodes=1:ppn=2
#PBS -l mem=4gb
#PBS -l nodes=1:x86:ppn=2
Nextflow config:
process{
executor= "pbs"
queue= "sabmi"
clusterOptions = {"-l nodes=1:${task.ext.arch}:ppn=${task.cpus}"}
withName: "FOO" {
ext.arch = "x86"
cpus = { 2 * task.attempt }
memory = "4 GB"
}
}
executor.queueSize = 15
executor.pollInterval = '5 sec'
The closure is working as intended but unfortunately we still have 2 headers specifying cpus on that.
@Mxrcon , the problem is with the git commands
- This command only switches to a new local branch (and does not pull the changes in the corresponding remote/branch)
$ git checkout -b ben/pbspro-cluster-options
Switched to a new branch 'ben/pbspro-cluster-options'
- Use the
git log --oneline
to check history and it shows that only the commits from master have been pulled
git log --oneline
811e7ca8e (HEAD -> master, origin/master, origin/HEAD, ben/pbspro-cluster-options) Add support for Java 19
0884e80e6 Add support for custom conda channels (#3435) [ci fast]
1ed2640a8 Add time directive to AWS Batch, clean language (#3436) [ci skip]
ab5bd81b9 Update err message [ci fast]
68b45c929 Fix Flux executor config (#3432) [ci fast]
...
...
Solution-1 (use gh
CLI - if you can setup on the cluster)
The CLI would do the following
- Set the upstream for the branch
- create a local branch and switch to it
- Sync with the remote branch
$ gh pr checkout 3015
Branch 'ben/pbspro-cluster-options' set up to track remote branch 'ben/pbspro-cluster-options' from 'origin'.
Switched to a new branch 'ben/pbspro-cluster-options'
Solution-2 (use vanilla git)
- Set the upstream for the branch
$ git branch --set-upstream-to=origin/ben/pbspro-cluster-options ben/pbspro-cluster-options
- Pull the changes from the upstream branch (this PR)
$ git pull --rebase
- Confirm Ben's last commit
$ git log --oneline | grep 'cbe6ef354'
cbe6ef354 Update executor.rst [ci skip]
@Mxrcon , could you please try again when you're at IEC? Looking forward to your response.
Thanks for the correction @abhi18av, It's always good to have some git-fu lessons.
Now I got it working.
- Updated .command.run header:
#PBS -N nf-FOO
#PBS -o xxxxxxx/pbs_directive_work/work/fd/2bfb91328ae6a1adc9f517d647d902/.command.log
#PBS -j oe
#PBS -q sabmi
#PBS -l nodes=1:x86:ppn=2
#PBS -l mem=4gb
- Qstat output:
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
----------------------- ----------- -------- ---------------- ------ ----- ------ --------- --------- - ---------
33771.hpcnod11 xxxxxxxxxxx sabmi nf-FOO 17008 1 2 4gb -- C -- hpcnod16/0-1
Thank you @bentsherman, This implementation seems to fix the issue even when using closures!!
Also, the warn is very nice: WARN: cpus directive is ignored when clusterOptions contains -l option tip: clusterOptions = { "-l nodes=1:ppn=${task.cpus}:..." }
@abhi18av @bentsherman @pditommaso, I extended this test running nf-core mag on test config to seem if this could be used for already existing pipelines.
Command used: ./nextflow/launch.sh run nf-core/mag -profile test,docker --outdir mag_results -c nextflow.config
nextflow.config content:
process{
executor= "pbs"
queue= "sabmi"
clusterOptions = {"-l nodes=1:hpcnod16:ppn=${task.cpus}"}
}
executor.queueSize = 15
executor.pollInterval = '5 sec'
Output:
Completed at: 28-Nov-2022 12:55:46
Duration : 10m 12s
CPU hours : 1.3 (0% failed)
Succeeded : 156
Ignored : 1
Failed : 1
Hi team,
This thing came up in another context where another university has docker installed only on a single node and then its a hard requirement to rely upon -l
cluster options to be able to send jobs to only that node.
Request for comments: Till the moment that this particular PR is merged, I'm wondering if there is a gotcha on using a custom build for nextflow using make pack
?
1. Checkout this branch locally
gh pr checkout 3015
Branch 'ben/pbspro-cluster-options' set up to track remote branch 'ben/pbspro-cluster-options' from 'origin'.
Switched to a new branch 'ben/pbspro-cluster-options'
2. Clean up any previously built assets
$~/d/p/nextflow (ben/pbspro-cluster-options)> make clean
rm -rf .nextflow*
rm -rf work
rm -rf modules/nextflow/.nextflow*
rm -rf modules/nextflow/work
rm -rf build
rm -rf modules/*/build
rm -rf plugins/*/build
./gradlew clean
Starting a Gradle Daemon, 1 busy Daemon could not be reused, use --status for details
BUILD SUCCESSFUL in 10s
24 actionable tasks: 1 executed, 23 up-to-date
3. Rely upon the make pack
utility which produces a bundled Nextflow version
$~/d/p/nextflow (ben/pbspro-cluster-options)> make pack
BUILD_PACK=1 ./gradlew packAll
> Task :nextflow:compileGroovy
Note: /home/abhinav/data/projects/nextflow/modules/nextflow/src/main/groovy/nextflow/sort/BigSort.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
> Task :plugins:nf-amazon:compileGroovy
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: /home/abhinav/data/projects/nextflow/plugins/nf-amazon/src/main/com/upplication/s3fs/S3FileSystemProvider.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
> Task :packAll
+ Nextflow package `ALL` copied to: /home/abhinav/data/projects/nextflow/build/releases
Deprecated Gradle features were used in this build, making it incompatible with Gradle 8.0.
You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.
See https://docs.gradle.org/7.6/userguide/command_line_interface.html#sec:command_line_warnings
BUILD SUCCESSFUL in 45s
30 actionable tasks: 17 executed, 2 from cache, 11 up-to-date
4. Check that the bundled version is created
$~/d/p/nextflow (ben/pbspro-cluster-options)> ls /home/abhinav/data/projects/nextflow/build/releases
nextflow-22.12.0-edge-all*
$~/d/p/nextflow (ben/pbspro-cluster-options)> /home/abhinav/data/projects/nextflow/build/releases/nextflow-22.12.0-edge-all info
Version: 22.12.0-edge build 5832
Created: 16-12-2022 09:49 UTC (11:49 SAST)
System: Linux 5.15.0-43-generic
Runtime: Groovy 3.0.13 on OpenJDK 64-Bit Server VM 17.0.4+8-LTS
Encoding: UTF-8 (UTF-8)
5. Rename that version to avoid mixing with the official release (nextflow-22.12.0-pbspatch-pr3015-edge-all
)
$~/d/p/nextflow (ben/pbspro-cluster-options)> mv /home/abhinav/data/projects/nextflow/build/releases/nextflow-22.12.0-edge-all /home/abhinav/data/projects/nextflow/build/releases/nextflow-22.12.0-pbspatch-pr3015-edge-all
6. Launch the jobs using this custom build of Nextflow with the PBS patch.
@pditommaso , could you kindly let us know if there's anything pending on this PR? ✅
@abhi18av sorry for the late reply. @bentsherman I've made a few changes, please have a look
Changes look good, I would say just revert the tokenize since the space after -l
is not required.
still not super happy with .indexOf('-l')
because it would catch also any string like --some-option-like-this
I agree. Maybe we could just keep the tokenize and see what happens? 😅
Thanks for picking this back up Paolo and Ben 💚
I think that for the use-cases I have seen, the current state might suffice so perhaps we take this forward and we can circle back after taking the edge release for a spin with more nf-core and custom pipelines?
Abhinav you know that we only deliver very high-quality software! 😆
I think this can be addressed easily using a simple regex to match -l.+
or -l.+
(or something like that)
https://github.com/nextflow-io/nextflow/blob/8f079a9568da039751007b3e034a0d032e566641/modules/nextflow/src/main/groovy/nextflow/executor/PbsProExecutor.groovy#L74-L74
Done fbb4d9a2
Looks good to merge 👍
Abhinav you know that we only deliver very high-quality software! 😆
What happened to user-driven testing for edge/snapshot releases? 😉
======
Looking forward to the edge release - I know of atleast two institutions 🏫 unblocked by this!
CC @Mxrcon
They will come!
Included in version 23.02.0-edge
Looking forward to the edge release - I know of atleast two institutions school unblocked by this!
Included in version 23.02.0-edge
Thank you @pditommaso, I'll test on our production cluster and take some nf-core pipelines for testing