Add Array Node Index To current_context [Core feature]
Motivation: Why do you think this is important?
In many cases, the index of the array node is needed inside a map task, especially when the task writes a file with a static name that subsequent array nodes would overwrite. The index could be used to give that file a unique name, for example, and to trace an output back to the node that produced it.
Goal: What should the final outcome look like, ideally?
The value returned by current_context() should include a node_index field (or a similarly well-named field) representing the map_task index.
Describe alternatives you've considered
I generated a random id for each instance to distinguish which output came from which node.
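For reference, the workaround can be sketched roughly as follows. This is only an illustration that assumes the task writes a single file; the helper name unique_output_path and its parameters are hypothetical and not part of flytekit.

```python
import uuid
from pathlib import Path


def unique_output_path(directory: str, stem: str = "output", suffix: str = ".txt") -> Path:
    """Build a file path whose name embeds a random ID.

    Hypothetical helper: each map task instance calls it once, so parallel
    instances do not overwrite each other's files.
    """
    return Path(directory) / f"{stem}-{uuid.uuid4().hex}{suffix}"
```

The drawback is the one described above: the random ID prevents overwrites but says nothing about which array node produced the file.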
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
- [X] Yes
Have you read the Code of Conduct?
- [X] Yes
Thank you for opening your first issue here! 🛠
To add an index to the current_context for array nodes, you can propose this feature enhancement in Flyte. According to the Flyte 1.9 release notes, Flyte introduced ArrayNode map tasks, which are an evolution of the existing map tasks, providing more robust support for subtask executions. However, current_context does not currently include an index for array nodes.
Proposed Solution:
- Motivation: Adding an index to the current_context would help uniquely identify each array node's output, preventing overwrites and aiding in debugging.
- Goal: Modify the current_context to include a node_index field representing the map_task index.
- Alternatives: Generating random IDs for each instance, although this is less efficient and harder to trace.
Would you like more details on how to proceed with this proposal or how to implement a workaround?
Example Code Block:
from flytekit import current_context, task

@task
def my_task():
    ctx = current_context()
    # Hypothetical field: illustrates the proposed addition, not an existing attribute
    node_index = ctx.execution_id.node_id
    print(f"Node Index: {node_index}")
This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.
Mention @runllm to ask follow-up questions. See runllm.com for more!
This is a simple addition. The index is already available in the environment, and we should absolutely add it to the current context.
Currently it is calculated in the array node itself - here
We should add it to the execution parameters like other information here
Follow the pattern above
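As a rough sketch of reading the index from the environment, something like the following could work until the field exists. The two environment variable names are an assumption based on how flytekit's map task code appears to compute the index and should be checked against the linked source. The function array_node_index is hypothetical.

```python
import os
from typing import Mapping, Optional

# Assumed names: one variable holds the NAME of the variable that carries the
# index, the other holds an optional offset. Verify both against flytekit.
INDEX_VAR_NAME_KEY = "BATCH_JOB_ARRAY_INDEX_VAR_NAME"
INDEX_OFFSET_KEY = "BATCH_JOB_ARRAY_INDEX_OFFSET"


def array_node_index(env: Optional[Mapping[str, str]] = None) -> Optional[int]:
    """Return the array node index, or None outside an array node subtask."""
    env = os.environ if env is None else env
    var_name = env.get(INDEX_VAR_NAME_KEY)
    if var_name is None or var_name not in env:
        return None  # no index information in this environment
    # The offset defaults to 0 when it is not set.
    return int(env.get(INDEX_OFFSET_KEY, "0")) + int(env[var_name])
```

Passing the environment as a parameter keeps the lookup easy to test. The actual change would instead store the value on the execution parameters so that current_context() exposes it.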
Please assign this task to me 🙌
Can I work on this issue?
I've assigned this issue to @Samruddhi345A, but we encourage @Atharva1723 and others interested to submit a pull request as well. The first valid pull request will be merged.
"Hello 👋, this feature request has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 14 days. Thank you for your contribution and understanding! 🙏"
I have raised a PR for this.