[Core feature] Map node array tasks do not support `@dynamic`, `@eager`, or `@workflow` nodes
Motivation: Why do you think this is important?
Map node array tasks allow users to map a @task over a list of inputs. Currently, the other node types (@dynamic, @eager, and @workflow) are not supported.
As a user, I would like to implement the following logic that today cannot be expressed in Flyte due to this limitation:
@workflow
def map_workflow_consume_partial_results(data: list[int] = [10, 12, 11, 10, 13, 12, 100, 11, 12, 10]):
    partial_results = map_task(dynamic_subwf, min_success_ratio=0.75)(data_point=data)
    consume_partial_results_task(results=partial_results)
In this example, dynamic_subwf (or a normal sub workflow) is mapped over the inputs, and the task consume_partial_results_task consumes whichever outputs were produced, even if not all (dynamic) sub workflows succeed.
Currently, consuming partial results is only possible when mapping a @task, but not all logic can be compressed into a single task.
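To make the intended semantics concrete, here is a minimal plain-Python sketch of what "consume partial results" means. It is a toy model and not flytekit code: `map_with_min_success_ratio`, `process_data_point`, and `consume_partial_results` are hypothetical names, and representing a failed item as `None` is an assumption of this sketch rather than something taken from Flyte's implementation.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def map_with_min_success_ratio(
    fn: Callable[[T], R],
    inputs: List[T],
    min_success_ratio: float = 1.0,
) -> List[Optional[R]]:
    """Toy model: apply fn to every input and tolerate some failures.

    Failed items become None. If the fraction of successful items falls
    below min_success_ratio, the whole map fails.
    """
    results: List[Optional[R]] = []
    for item in inputs:
        try:
            results.append(fn(item))
        except Exception:
            # Simplification (assumption): any exception marks the item as failed.
            results.append(None)

    successes = sum(r is not None for r in results)
    if inputs and successes / len(inputs) < min_success_ratio:
        raise RuntimeError(
            f"only {successes}/{len(inputs)} items succeeded, "
            f"below min_success_ratio={min_success_ratio}"
        )
    return results


def process_data_point(data_point: int) -> int:
    # Hypothetical stand-in for the mapped logic; rejects outliers.
    if data_point > 50:
        raise ValueError(f"outlier: {data_point}")
    return data_point * 2


def consume_partial_results(results: List[Optional[int]]) -> int:
    # Downstream consumer: skip failed items (None) and use the rest.
    return sum(r for r in results if r is not None)


data = [10, 12, 11, 10, 13, 12, 100, 11, 12, 10]
partial_results = map_with_min_success_ratio(
    process_data_point, data, min_success_ratio=0.75
)
total = consume_partial_results(partial_results)
```

Here nine of ten items succeed, which is above the 0.75 threshold, so the consumer still runs on the nine available results. The request in this issue is for the same behaviour when the mapped unit is a (dynamic) sub workflow instead of a single task.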
Goal: What should the final outcome look like, ideally?
Support every node type in map_task.
Describe alternatives you've considered
One can execute multiple (dynamic) sub workflows and tolerate failures of some of them by using WorkflowFailurePolicy.FAIL_AFTER_EXECUTABLE_NODES_COMPLETE. However, this does not allow consuming the partial results of the successful (dynamic) sub workflows.
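The limitation of this alternative can be illustrated with another plain-Python toy model. It is only a sketch of the described behaviour (every independent node runs, then the workflow fails if any node failed), not Flyte's actual engine logic; `run_fail_after_nodes_complete` and `make_node` are hypothetical names.

```python
from typing import Callable, Dict, List


def run_fail_after_nodes_complete(
    nodes: Dict[str, Callable[[], int]],
) -> Dict[str, int]:
    """Toy model: run every independent node, then fail if any node failed."""
    outputs: Dict[str, int] = {}
    errors: List[str] = []
    for name, node in nodes.items():
        try:
            outputs[name] = node()
        except Exception as exc:
            # Keep going so the remaining nodes still get to run.
            errors.append(f"{name}: {exc}")
    if errors:
        # The run as a whole fails, so the caller never sees the
        # outputs of the nodes that did succeed.
        raise RuntimeError("; ".join(errors))
    return outputs


def make_node(data_point: int) -> Callable[[], int]:
    # Hypothetical stand-in for one (dynamic) sub workflow execution.
    def node() -> int:
        if data_point > 50:
            raise ValueError(f"outlier: {data_point}")
        return data_point * 2

    return node


nodes = {f"n{i}": make_node(x) for i, x in enumerate([10, 12, 100, 11])}

consumer_ran = False
try:
    outputs = run_fail_after_nodes_complete(nodes)
    consumer_ran = True  # a downstream consumer would run only here
except RuntimeError as exc:
    failure_message = str(exc)
```

In this model all four nodes are attempted, but because one fails, the run raises and the downstream consumer is never reached, which is the gap the issue describes.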
Currently, map_task in Flyte does not support @dynamic, @eager, or @workflow nodes. This limitation is acknowledged in various discussions and issues within the Flyte community. According to one such discussion, map_task only supports Flyte's PodPlugin tasks, such as Python tasks, container tasks, and pod tasks.
A potential workaround is to use dynamic tasks within a static workflow, but this does not fully address the need for partial result consumption. The Flyte team is working on extending support for mapping over different Flyte node types, including dynamic workflows and subworkflows, through the introduction of ArrayNode. This feature is still under development and not yet available.
This is an AI-generated response.
Wow, what timing - I just came across this same situation myself. Glad to see it's on the table.
@fg91 / @rovangju we just merged a generic capability that lets map tasks support arbitrary node types. It is currently available in Union only; we may upstream it in a bit. The challenge is the state space in etcd, which is restricted. We also support launch plans etc. If this is urgent, try Union?
@fg91 @rovangju This is a Union-only feature for now. Here is the reason:
Fundamentally, mapping over launch plans needs a large state storage system, which does not fit the default array_nodes storage system. Union has a different engine that can store data efficiently and in higher resolution, and so surpasses these limits. It's a large lift to support in OSS and we do not currently plan to support it, but we will definitely revisit this in a few months as we upstream some of our learnings from Union.
"Hello 👋, this feature request has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 14 days. Thank you for your contribution and understanding! 🙏"