metaflow icon indicating copy to clipboard operation
metaflow copied to clipboard

Add a way to identify or clean up AWS resources

Open rchui opened this issue 4 years ago • 3 comments

When deploying into production we generate AWS resources using:

python <workflow>.py --with retry step-functions create

The most basic workflow:

from metaflow import FlowSpec, step


class TestFlow(FlowSpec):
    @step
    def start(self) -> None:
        print("hello world!")
        self.next(self.end)

    @step
    def end(self) -> None:
        pass


if __name__ == "__main__":
    TestFlow()

creates a AWS Batch job definition (metaflow_<some-hash>) and a AWS Step Function state machine with the same name as the FlowSpec class (TestFlow in this case). All of these resources are created transparently and (as far as I can tell) there is no way to track them in Metaflow or through AWS APIs. This makes it difficult to maintain visibility on all of the resources being created by users and to eventually go back and clean them up if necessary.

Either AWS resources created by Metaflow commands should be tagged with an AWS tag that can logically group resources together (even better if the user can specify them), and/or the step-functions subgroup command should provide a way to destroy created resources.

I'm personally more in favor of the former because it would also allow costs to be tracked along with identifying resources.

rchui avatar Nov 17 '20 18:11 rchui

Makes sense. Although not all resources can be tagged by Metaflow - for example, the underlying ECS instances used by AWS Batch can't be tagged because they will time-share with other applications.

savingoyal avatar Dec 15 '20 18:12 savingoyal

Any updates on this?

Isaac4real avatar Jan 17 '24 11:01 Isaac4real

Interested in this as well...we track all our resources with tags and it's a big limitation that we can't use them on deployed step functions. Also, there should be a straightforward way to destroy all resources that were created when deploying a step function

ryewilson avatar Jan 30 '24 18:01 ryewilson

metaflow sets tags on all AWS Batch jobs which AWS takes into account for cost reporting.

savingoyal avatar Aug 16 '24 06:08 savingoyal