nextflow
nextflow copied to clipboard
feature: nullable in input/output path
new feature to allow nonexisting paths in input and output parameters
closes #2678
p.d.
it has 2 tests ignored in ScriptDslTest
because with them ScriptRunnerTest
fails by Timeout but don't know why
In the end, I think I've found the way to run all tests although it's a little tricky and probably we need to investigate deeper
Don't know why/where if we run in the same ./gradlew test
command InputNullableTest + OutputNullableTest + ScriptDslTest some of them fail and I think it's due to some static variable related to the execution is not refreshed
So to solve this situation I exclude InputNullable and OutputNullable from the tests and run them in an isolated test
Why do we prefer nullable versus optional? The latter could have type Option. Which is more idiomatic of something that is optional? Also, nullable is less intuitive to folks who don’t know what null means.
The rationale was described at point 3 here. To summarise we are using already optional
with a different semantic that's when an optional value is absent no value is emitted.
This is quite different from emitting a value with a null
value. Therefore using optional
for both can be quite confusing.
Are we calling it allowNull
or nullable
? I think we should call it nullable
like an adjective.
I agree, nullable
sounds better
We need to improve the e2e pipeline tests, ideally we should have one process emit a nullable path to another process, and it succeeds or fails based on whether or not the nullable paths are specified correctly.
Also, the original motivation for this feature was a tuple that contains a path. I wonder what happens if a process emits a nullable path by itself. I guess it's just a null value?
:warning: 7 God Classes were detected by Lift in this project. Visit the Lift web console for more details.
@jorgeaguileraseqera This was not supported to be merged. Above all it looks like you force-pushed your branch overriding the changes in this PR.
Do you have a local copy?
Yes, I have it
I'm reviewing it and trying to reopen the PR.
Sorry for inconveniences
Just to confirm that this would close https://github.com/nextflow-io/nextflow/issues/1694? If so, could you link the issue from this PR (so merging automatically closes)?
Not quite. This PR pertains only to nullable paths, whereas that issue pertains to optional inputs in general, and I don't think we have fully worked out the meaning of ?optional", i.e. passing a null or passing nothing. Also, I think #1694 is tied up with #2257 , because you might want to have named inputs so that you can provide only the inputs you want.
@pditommaso I cleaned up this PR and updated the e2e tests. Also fixed a bug with the nullable input path. I think this PR is ready to go.
Here's the output of the main e2e test, in which process foo emits a nullable path and process bar accepts it:
$ ./launch.sh tests/nullable-path-succeeds.nf
N E X T F L O W ~ version 23.02.0-edge
Launching `tests/nullable-path-succeeds.nf` [naughty_fermi] DSL2 - revision: 41393792fb
executor > local (2)
[de/f4993a] process > foo (1) [100%] 1 of 1 ✔
[51/badde8] process > bar (1) [100%] 1 of 1 ✔
foo
output.txt
output.txt
WARN: File system is unable to get file attributes file: nextflow.util.NullPath@7f -- Cause: java.nio.file.ProviderMismatchException
It works but maybe we should figure out how to suppress the file attributes warning?
In case the unit tests don't run, I confirmed that they pass locally:
$ make smoke
NXF_SMOKE=1 ./gradlew test
BUILD SUCCESSFUL in 3m 14s
91 actionable tasks: 39 executed, 52 up-to-date
While I understand the difference between optional and nullable, I'm wondering if we shouldn't just extend optional
to be allowed within tuple elements and handle it accordingly, rather than adding a new option. In other words:
- if a tuple or standalone path is declared optional, then emit nothing
- if a nested path is declared optional, then set the tuple element to NullPath
The only thing you lose that way is making a standalone path nullable vs optional, but I don't think that is needed.
I agree trying to not overload the declaration with new keyword, but what about point 3 here (I copy it below for your convenience):
I think using optional for both the existing mechanism and the new one could bring a serious semantics inconstancy. When using the former the output is NOT emitted when missing. Using the new approach is emitted with null. Imagine the explain the difference between output: val(x) optional and output: val(x, optional:true).
I would avoid that issue by having the nullable behavior only apply to nested params in a tuple, that way there is no distinction between val(x), optional: true
and val(x, optional: true)
.
However, your comment here makes me think there is a case for standalone paths to be nullable. So now I am leaning back towards having a separate nullable
option as it is currently implemented.
Hi there - is there any update on whether this will be merged into the main branch and released with the next version of nextflow? I have a use case in my current pipeline that would really benefit from this (specifically, using the remainder:true option when joining two channels that I pass onto a subsequent process).
I do need to fix the merge conflict, but otherwise it's just waiting for @pditommaso 's approval.
Closing in favor of #3706