Add Array Node Index To current_context [Core feature]

Open nhattab303 opened this issue 1 year ago • 8 comments

Motivation: Why do you think this is important?

In many cases, the index of the array node within a map task is needed, especially when the task writes a file with a static name that subsequent array nodes would overwrite. The index could be used to give that file a unique name, for example, and to trace each output back to the node that produced it.

Goal: What should the final outcome look like, ideally?

The value returned by current_context() should include a field, node_index or another suitable name, representing the map_task index.

Describe alternatives you've considered

I generated a random id for each instance to distinguish which output came from which node.
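
A minimal sketch of that workaround, using only the Python standard library. The helper name unique_output_path and its parameters are hypothetical, chosen for illustration; they are not part of flytekit.

```python
import uuid
from pathlib import Path


def unique_output_path(directory: str, stem: str = "output", suffix: str = ".txt") -> Path:
    """Build an output path whose file name embeds a random ID.

    Hypothetical helper: each map task instance calling this gets a
    different file name, so parallel instances do not overwrite each other.
    """
    run_id = uuid.uuid4().hex  # random 32-character hex string
    return Path(directory) / f"{stem}-{run_id}{suffix}"
```

The drawback is the one this issue is about: a random ID prevents overwrites, but it does not say which array node produced the file.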

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

  • [X] Yes

Have you read the Code of Conduct?

  • [X] Yes

nhattab303 avatar Jul 02 '24 18:07 nhattab303

Thank you for opening your first issue here! 🛠

welcome[bot] avatar Jul 02 '24 18:07 welcome[bot]

To add an index to current_context for array nodes, you can propose this feature enhancement in Flyte. According to the Flyte 1.9 release notes, Flyte introduced ArrayNode map tasks, an evolution of the existing map tasks that provides more robust support for subtask executions. However, current_context does not yet include an index for array nodes.

Proposed Solution:

  1. Motivation: Adding an index to the current_context would help uniquely identify each array node's output, preventing overwrites and aiding in debugging.
  2. Goal: Modify the current_context to include a node_index field representing the map_task index.
  3. Alternatives: Generating random IDs for each instance, although this is less efficient and harder to trace.

Would you like more details on how to proceed with this proposal or how to implement a workaround?

Example Code Block:

from flytekit import current_context, task

@task
def my_task():
    ctx = current_context()
    # Hypothetical field addition: execution_id has no node_id attribute today
    node_index = ctx.execution_id.node_id
    print(f"Node Index: {node_index}")


This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @runllm to ask follow-up questions. See runllm.com for more!

runllm[bot] avatar Jul 02 '24 18:07 runllm[bot]

This is a simple addition. The index is already available in the environment, and we should absolutely add it to the current context.

Here is where it is declared

Currently it is calculated in the array node itself - here

We should add it to the execution parameters like other information here

Follow the pattern above
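
Until that lands, the index can probably be read from the environment inside the task. The sketch below assumes the variable names BATCH_JOB_ARRAY_INDEX_VAR_NAME (which holds the name of the variable carrying the raw index) and BATCH_JOB_ARRAY_INDEX_OFFSET; these are taken from memory of flytekit's map task code and should be verified against the flytekit version in use. The function name array_job_index is hypothetical.

```python
import os
from typing import Mapping, Optional


def array_job_index(environ: Optional[Mapping[str, str]] = None) -> int:
    """Compute the map task index from environment variables.

    Assumption: BATCH_JOB_ARRAY_INDEX_VAR_NAME names another variable that
    carries the raw index, and BATCH_JOB_ARRAY_INDEX_OFFSET is an optional
    offset added to it. Verify both names against your flytekit version.
    """
    env = os.environ if environ is None else environ
    offset = int(env.get("BATCH_JOB_ARRAY_INDEX_OFFSET", "0"))
    var_name = env.get("BATCH_JOB_ARRAY_INDEX_VAR_NAME")
    if var_name is None:
        # Not running as an array job (or the assumed names are wrong).
        return offset
    return offset + int(env.get(var_name, "0"))
```

Passing the environment as a parameter keeps the function easy to test locally, for example array_job_index({"BATCH_JOB_ARRAY_INDEX_VAR_NAME": "IDX", "IDX": "3"}).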

kumare3 avatar Jul 21 '24 19:07 kumare3

Please assign this task to me 🙌

Samruddhi345A avatar Oct 02 '24 17:10 Samruddhi345A

Can I work on this issue?

Atharva1723 avatar Oct 06 '24 10:10 Atharva1723

I've assigned this issue to @Samruddhi345A, but we encourage @Atharva1723 and others interested to submit a pull request as well. The first valid pull request will be merged.

samhita-alla avatar Oct 07 '24 16:10 samhita-alla

"Hello 👋, this feature request has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 14 days. Thank you for your contribution and understanding! 🙏"

github-actions[bot] avatar Jul 05 '25 00:07 github-actions[bot]

I have raised a PR for this.

Atharva1723 avatar Oct 04 '25 07:10 Atharva1723