Dynamic k8s pod resources

Open johannkm opened this issue 2 years ago • 5 comments

Discussed in https://github.com/dagster-io/dagster/discussions/8001

Originally posted by Tehada May 21, 2022

According to the docs, we can specify resources for an op's pod using the syntax below:

from dagster import job, op
from dagster_k8s import k8s_job_executor

@op(
    tags={
        "dagster-k8s/config": {
            "container_config": {
                "resources": {
                    "requests": {"cpu": "200m", "memory": "32Mi"},
                }
            },
        }
    }
)
def my_op(context):
    context.log.info("running")

@job(executor_def=k8s_job_executor)
def my_job():
    my_op()

But what if I want to set the pod's resources at runtime, based on the result of an upstream op's calculation? Is that possible? My use case is a task that lists a GCS bucket for new files (the number of files and their sizes vary from day to day) and then launches a pod to process each file with custom resource values: some files require 20 GiB of RAM, while others can take up to 100 GiB.

johannkm avatar May 25 '22 14:05 johannkm

+1 to this. Dynamic resource provisioning would be helpful in my use case.

ascrookes avatar Jun 23 '22 21:06 ascrookes

+1, but it would also be nice to configure k8s resource tags per op, even via Dagit.

fahadkh avatar Jul 08 '22 16:07 fahadkh

Hello, I am also interested.

I wonder if you could split the graph in two.

The first graph does the calculation, and its last op triggers a job based on the computed resources.

Let's say with https://docs.dagster.io/_apidocs/schedules-sensors#dagster.RunRequest?

yield RunRequest(
    run_key=filename,
    run_config={...},
    tags={
        "dagster-k8s/config": {
            "container_config": {
                "resources": {
                    "requests": {"cpu": "200m", "memory": "32Mi"},
                }
            },
        }
    }
)

I would try this as a workaround.
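
For illustration only, here is a minimal sketch of how the tag value could be computed per file instead of hard-coded. The helper names (`memory_request_gib`, `k8s_resource_tags`) and the sizing rule (4x the file size, with a 1 GiB floor) are assumptions made up for this example, not part of Dagster; only the shape of the "dagster-k8s/config" tag comes from the snippets in this thread.

```python
import math

GIB = 1024 ** 3


def memory_request_gib(file_size_bytes: int, multiplier: float = 4.0, floor_gib: int = 1) -> int:
    """Estimate a memory request in whole GiB from a file size.

    The multiplier and floor are hypothetical; tune them for the real workload.
    """
    needed = math.ceil(file_size_bytes * multiplier / GIB)
    return max(floor_gib, needed)


def k8s_resource_tags(cpu: str, memory_gib: int) -> dict:
    """Build a "dagster-k8s/config" tag with the same shape used in this thread."""
    return {
        "dagster-k8s/config": {
            "container_config": {
                "resources": {
                    "requests": {"cpu": cpu, "memory": f"{memory_gib}Gi"},
                }
            }
        }
    }


# Example: tags for a 5 GiB file under the assumed 4x sizing rule.
tags = k8s_resource_tags("200m", memory_request_gib(5 * GIB))
```

The resulting dict could then be passed as `tags=` in the `RunRequest` above. Whether a run-level tag ends up on the run pod or on the per-op step pods depends on the run launcher and executor in use, so that is worth checking on a test deployment first.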

slamer59 avatar Aug 04 '22 12:08 slamer59

> The first graph does the calculation, and its last op triggers a job based on the computed resources.
>
> Let's say with https://docs.dagster.io/_apidocs/schedules-sensors#dagster.RunRequest?

The only problem I see is that RunRequest is meant to be used with a sensor, not an op.

fahadkh avatar Aug 04 '22 21:08 fahadkh

I don't see why it cannot be used. I see it as a wrapper around the Dagster GraphQL API. You could call GraphQL directly, and it should be the same?

Or use DagsterGraphQLClient: https://docs.dagster.io/_apidocs/libraries/dagster-graphql#dagster_graphql.DagsterGraphQLClient.submit_job_execution
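
A rough sketch of that idea, assuming the installed version of `submit_job_execution` accepts `run_config` and `tags` keyword arguments (check the linked API docs for your Dagster version). The helper `submit_with_resources` is a hypothetical name; it takes the client as a parameter so it can be called from inside an op with a real `DagsterGraphQLClient`.

```python
from typing import Any, Dict, Optional


def submit_with_resources(
    client: Any,
    job_name: str,
    cpu: str,
    memory: str,
    run_config: Optional[Dict[str, Any]] = None,
) -> str:
    """Submit a run of `job_name` tagged with per-run k8s resource requests.

    `client` is expected to behave like dagster_graphql.DagsterGraphQLClient;
    passing `tags` here is an assumption about the installed version.
    """
    tags = {
        "dagster-k8s/config": {
            "container_config": {
                "resources": {"requests": {"cpu": cpu, "memory": memory}},
            }
        }
    }
    # submit_job_execution returns the ID of the newly launched run.
    return client.submit_job_execution(job_name, run_config=run_config or {}, tags=tags)


# Usage inside an op (host, port, and job name are placeholders):
#   from dagster_graphql import DagsterGraphQLClient
#   client = DagsterGraphQLClient("localhost", port_number=3000)
#   run_id = submit_with_resources(client, "process_file_job", "200m", "20Gi")
```

Taking the client as an argument keeps the helper easy to exercise with a stub, which helps because the real call needs a running Dagster webserver.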

slamer59 avatar Aug 05 '22 06:08 slamer59