Types of changes

New feature (non-breaking change which adds functionality)

Summary

This PR adds support for returning total or partial mappings of the output spec by the wrapped function, as an alternative to going through a conversion from the returned tuple to a dictionary via output_names. Unlike the named tuple case where the shape of the data is known a priori (I have got exactly 2 output variables which I want named to foo and bar), this PR adds support for functions which may return more, less or exactly these data as an arbitrary mapping.

This could be particularly useful for dynamic task construction use cases, like BIDS readers, parsers and writers.

Checklist

[ ] I have added tests to cover my changes (if necessary)
[ ] I have updated documentation (if necessary)

Aug 16 '22 13:08 ghisvail

For a concrete example, it would simplify this piece of terrible code very much:

https://github.com/aramis-lab/pydra-bids/blob/main/pydra/tasks/bids/utils/read_bids.py

Aug 16 '22 13:08 ghisvail

A safer alternative could also be to only return if the mapping returned by the inner function contains all keys declared in output_names. Right now, I went for the coalesce to None option, since this is what we are going for in case the function did not return anything. I am happy with either.

Aug 16 '22 13:08 ghisvail

a side note: for dynamic tasks as being targeted here, the cache checksum/location will only depend on inputs (e.g. the query string/template), not the underlying database. hence either the source directory would need to be provided and cached as a separate input(a potentially expensive operation), or another default input should be associated that uses a random number generator as a default. the current forcing option of self.rerun_task may not suffice in all cases, especially if the same task gets called multiple times.

Aug 16 '22 15:08 satra

Good point. How does caching work today with nipype for things like BIDSDataGrabber and DataSink? Would it work differently with pydra fundamentally?

Aug 16 '22 16:08 ghisvail

How does caching work today with nipype for things like BIDSDataGrabber and DataSink?

caching is local to a workflow and nipype1 does not have the concept of reusing the same task in multiple places or even across workflows. this is only available in pydra. in nipype1, simply setting the rerun flag would force that interface to rerun, but because there is no shared rerun capabilities, this particular issue does not come into play. so dynamic interfaces are simply set to rerun by default in nipype1.

Aug 16 '22 18:08 satra

Codecov Report

Base: 81.23% // Head: 38.26% // Decreases project coverage by -42.96% :warning:

Coverage data is based on head (0f8869b) compared to base (a9a116d). Patch coverage: 0.00% of modified lines in pull request are covered.

:exclamation: Current head 0f8869b differs from pull request most recent head b89f9a2. Consider uploading reports for the commit b89f9a2 to get more accurate results

Additional details and impacted files

@@             Coverage Diff             @@
##           master     #565       +/-   ##
===========================================
- Coverage   81.23%   38.26%   -42.97%     
===========================================
  Files          20       20               
  Lines        4391     4393        +2     
  Branches     1262     1202       -60     
===========================================
- Hits         3567     1681     -1886     
- Misses        820     2416     +1596     
- Partials        4      296      +292

Flag	Coverage Δ
unittests	`38.24% <0.00%> (-43.00%)`	:arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
pydra/engine/task.py	`47.09% <0.00%> (-46.45%)`	:arrow_down:
pydra/tasks/__init__.py	`0.00% <0.00%> (-100.00%)`	:arrow_down:
pydra/mark/functions.py	`17.64% <0.00%> (-82.36%)`	:arrow_down:
pydra/engine/audit.py	`27.50% <0.00%> (-70.00%)`	:arrow_down:
pydra/engine/state.py	`30.86% <0.00%> (-65.54%)`	:arrow_down:
pydra/engine/graph.py	`31.89% <0.00%> (-62.80%)`	:arrow_down:
pydra/engine/helpers_state.py	`35.08% <0.00%> (-60.53%)`	:arrow_down:
pydra/utils/messenger.py	`47.94% <0.00%> (-47.95%)`	:arrow_down:
pydra/engine/helpers.py	`39.00% <0.00%> (-47.76%)`	:arrow_down:
pydra/engine/specs.py	`48.21% <0.00%> (-46.56%)`	:arrow_down:
... and 9 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

Aug 19 '22 16:08 codecov[bot]

@ghisvail - can't access the link with the example anymore.

perhaps you can also add some test, e.g. pydra/mark/tests/test_functions.pyand/orpydra/engine/tests/test_task.py`

Aug 19 '22 17:08 djarecka

Yes, I am planning to add some unit tests covering this particular usage.

I'll ping you once done :+1

Aug 20 '22 15:08 ghisvail

Yes, I am planning to add some unit tests covering this particular usage.

I'll ping you once done :+1

Aug 20 '22 15:08 ghisvail

I added a test case which covers present, absent and undefined mappings from returned dict to annotation.

A natural extension to this PR would be to handle functions annotated with a TypedDict return value.

Dec 31 '22 11:12 ghisvail

@ghisvail - can't access the link with the example anymore.

I did some refactoring of the code in the meantime. This is what it looks like today:

https://github.com/aramis-lab/pydra-bids/blob/main/pydra/tasks/bids/utils.py

Granted it's likely a terrible way of making dynamic tasks for reasons I have not thought of, but I sense there could be a nice pattern emerging from this.

Dec 31 '22 11:12 ghisvail

@djarecka do you think we could get this one in for the release too?

Feb 17 '23 15:02 ghisvail

I was also thinking that extending this feature to TypedDict may help reduce typing redundancy by doing:

class Point(TypedDict):
    x: int
    y: int

@task
def random_point() -> Point:
    return Point(x=random.randrange(16), y=random.randrange(16))

And have the resulting task automatically expose x and y as integer-typed output.

Maybe?

Feb 17 '23 15:02 ghisvail

if you think that you could use it, then sure

Feb 17 '23 15:02 djarecka

Going forward, I think it would be good to deprecate {'name': type} for TypedDict, which didn't exist when we initially came up with the return annotations.

Feb 17 '23 15:02 effigies

Going forward, I think it would be good to deprecate {'name': type} for TypedDict, which didn't exist when we initially came up with the return annotations.

We might not need to deprecate it, since TypedDict can be constructed dynamically. So we could normalize tasks internals to use TypedDict but keep annotate as a user-friendly wrapper to help generate the TypedDict definition from the mapping passed in annotate.

What do you think of this plan?

Feb 17 '23 16:02 ghisvail

Sounds good to me.

Feb 17 '23 16:02 effigies

Ok, then I'll find some time to start hacking on it 👍

Feb 17 '23 16:02 ghisvail

@ghisvail - you want to still work in this PR, or you want me to merge this one and open a new one?

Feb 22 '23 01:02 djarecka

@djarecka please go ahead with merging. I'll open a follow-up PR for TypedDict.

Feb 25 '23 08:02 ghisvail

pydra
pydra copied to clipboard

Add support for arbitrary mappings in a FunctionTask

Types of changes

Summary

Checklist

Codecov Report

pydra pydra copied to clipboard

Add support for arbitrary mappings in a FunctionTask

Types of changes

Summary

Checklist

Codecov Report

pydra
pydra copied to clipboard