[BUG] Local and remote caching behaviour should not differ

Open fg91 opened this issue 1 year ago • 1 comments

Describe the bug

As a user, I would expect caching to behave the same whether a workflow is executed in a cluster or locally as a Python script.

In practice, there are situations where the behaviour differs:

Expected behavior

  • [ ] Caching of tasks without return values:

    from flytekit import task

    @task(cache=True, cache_version="1.0")
    def foo() -> None:
        print("Foo")
    

    Locally, this task can be cached, whereas in a cluster execution it cannot. Flyteconsole reports "Caching was disabled for this execution".

    As a user, I strongly prefer being able to cache tasks without a return value. Tasks can have side effects (e.g. storing a resulting metric in a metadata store) that need no return value but should still be cached. We have multiple tasks in our code base that have a dummy return value only to allow the task to be cached.

  • [ ] Cache misses upon schema changes:

    from dataclasses import dataclass
    from dataclasses_json import dataclass_json
    from flytekit import task, workflow
    
    
    @dataclass_json
    @dataclass
    class Foo:
        a: int
        # b: int
    
    
    @task(cache=True, cache_version="1.0")
    def t1() -> Foo:
        print("Foo")
        return Foo(a=42)  #, b=42)
    
    @workflow
    def wf():
        t1()
    
    
    if __name__ == "__main__":
        wf()
    

    To reproduce: execute this workflow, add b: int to Foo as an example of a schema change, and execute it again. The remote execution produces the expected cache miss, but the local execution produces an unexpected cache hit. The local behaviour needs to be adapted.
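One plausible way to picture the second discrepancy is sketched below in plain Python. It assumes that the local cache key covers only the task name, cache version and inputs, while the remote key additionally covers the task's interface. This is an illustration under that assumption, not flytekit's or the backend's actual code; `cache_key`, `interface_signature`, `FooV1` and `FooV2` are hypothetical names introduced here.

```python
import hashlib
import json
from dataclasses import dataclass, fields


def interface_signature(output_type) -> str:
    """Describe an output dataclass by its field names and type names (hypothetical helper)."""
    return json.dumps(
        [(f.name, getattr(f.type, "__name__", str(f.type))) for f in fields(output_type)]
    )


def cache_key(task_name, cache_version, inputs, signature=None) -> str:
    """Hash the parts that identify a cached result (hypothetical helper).

    When `signature` is None the key ignores the task interface, which is the
    assumed local behaviour; passing it models the assumed remote behaviour.
    """
    payload = {"task": task_name, "version": cache_version, "inputs": inputs}
    if signature is not None:
        payload["signature"] = signature
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


@dataclass
class FooV1:  # Foo before the schema change
    a: int


@dataclass
class FooV2:  # Foo after adding b: int
    a: int
    b: int


# Assumed local behaviour: the key does not see the output schema.
local_before = cache_key("t1", "1.0", {})
local_after = cache_key("t1", "1.0", {})

# Assumed remote behaviour: the key includes the task interface.
remote_before = cache_key("t1", "1.0", {}, interface_signature(FooV1))
remote_after = cache_key("t1", "1.0", {}, interface_signature(FooV2))

print(local_before == local_after)    # True  -> cache hit despite the schema change
print(remote_before == remote_after)  # False -> cache miss after the schema change
```

In this sketch, changing cache_version alters the key in both variants, so bumping it after a schema change would force a miss in either setting.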

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • [X] Yes

Have you read the Code of Conduct?

  • [X] Yes

fg91 avatar Oct 10 '24 10:10 fg91

In case anyone observes another situation where the behaviour differs, feel free to add to this issue.

fg91 avatar Oct 10 '24 10:10 fg91

#take

luckyarthur avatar Nov 01 '24 07:11 luckyarthur

@luckyarthur , please, let me know which bugs you're going to be fixing, ok?

eapolinario avatar Nov 13 '24 01:11 eapolinario

> @luckyarthur , please, let me know which bugs you're going to be fixing, ok?

I'm trying to fix this one now. Sorry it's taking longer; I'm new to this whole system and am still trying to locate where the backend handles caching for tasks with no return value.

luckyarthur avatar Nov 22 '24 12:11 luckyarthur

@luckyarthur , for sure, I just meant which one of the two issues described in the issue you were planning to tackle.

eapolinario avatar Nov 22 '24 17:11 eapolinario

> @luckyarthur , for sure, I just meant which one of the two issues described in the issue you were planning to tackle.

I'm working on both of them, since they are presented in one issue.

luckyarthur avatar Nov 23 '24 00:11 luckyarthur

"Hello 👋, this issue has been inactive for over 90 days. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏"

github-actions[bot] avatar May 20 '25 00:05 github-actions[bot]

Hello 👋, this issue has been inactive for over 90 days and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

github-actions[bot] avatar May 28 '25 00:05 github-actions[bot]