Missing artifact in the step.task.data._artifacts

Open super-shayan opened this issue 1 year ago • 1 comment

Hello, I am using Metaflow and scheduling parallel jobs on AWS Step Functions. My flow is structured as follows:

start() -> run() -> join() -> end(). In start(), I use foreach to call run() in parallel. I tried using the Client API to access the data from the run() step, as follows:

from metaflow import Flow, Step

flow = Flow('MyFlow')
step = Step('MyFlow/sfn-id/run')
list(step.tasks())  # there are multiple tasks available; I choose one below:
step.task  # is something like: MyFlow/sfn-id/run/task-id
step.task.data  # gives the output: <MetaflowData: >

Since I assigned a variable called "self.results" in the run() step, I expected to access the results by calling step.task.data.results, but this raises a KeyError:

File /anaconda3/lib/python3.11/site-packages/metaflow/client/core.py:738, in MetaflowData.__getattr__(self, name)
    737 def __getattr__(self, name: str):
--> 738     return self._artifacts[name].data

KeyError: 'results'

I also tried to see what artifacts are there by calling:

step.task.data._artifacts, but that comes back empty.
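For anyone debugging something similar: a foreach step has one task per branch, and step.task only looks at one of them. Below is a rough sketch for checking every task of a step through the public Client API instead of the private _artifacts attribute. The helper names artifact_names_by_task and collect_artifact are made up for illustration; they only assume that iterating a Step yields Task objects, that iterating a Task yields artifacts with an id, and that "name in task.data" is supported.

```python
def artifact_names_by_task(step):
    """Map each task pathspec in `step` to the sorted names of its artifacts.

    `step` is expected to be a metaflow.Step (iterating it yields Task
    objects, and iterating a Task yields DataArtifact objects with an `id`).
    """
    return {
        task.pathspec: sorted(artifact.id for artifact in task)
        for task in step
    }


def collect_artifact(step, name):
    """Return {task pathspec: value} for every task that has artifact `name`."""
    values = {}
    for task in step:
        # Membership test first, so a missing artifact is skipped
        # instead of raising KeyError on attribute access.
        if name in task.data:
            values[task.pathspec] = getattr(task.data, name)
    return values
```

With Metaflow installed, this could be called as artifact_names_by_task(Step('MyFlow/sfn-id/run')) and collect_artifact(Step('MyFlow/sfn-id/run'), 'results'), which shows at a glance which tasks, if any, persisted "results".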

super-shayan, Mar 19 '24 10:03

Correct, you might be running into this expected behavior. The link has details on how to correctly move state through a foreach. Let me know if that works!

savingoyal, Mar 27 '24 21:03