
[SageMaker] Pipeline execution overrides user-defined SageMaker configuration when running locally

Open thvasilo opened this issue 10 months ago • 0 comments

SageMaker local execution lets users configure the Docker containers through a local file at $HOME/.sagemaker/config.yaml. See https://aws.amazon.com/blogs/machine-learning/configure-and-use-defaults-for-amazon-sagemaker-resources-with-the-sagemaker-python-sdk/ for details.

An example file can be:

local:
    local_code: true # Using everything locally
    region_name: "us-east-1" # Name of the region
    container_config: # Additional docker container config
        shm_size: "58G"
        environment:
          - AWS_REGION: "us-east-1"

When a local session is created, this configuration is loaded as a dict into the session's config attribute:

from sagemaker.workflow.pipeline_context import LocalPipelineSession

local_session = LocalPipelineSession()
config_dict = local_session.config

However, in execute_sm_pipeline.py we override this config to set the shm size for the container, which discards whatever the user configured:

https://github.com/awslabs/graphstorm/blob/a145677290d1ec9f74ac0f702e98752d6cbd4ca5/sagemaker/pipeline/execute_sm_pipeline.py#L164-L172

Instead, we should only set shm_size if the user has not already configured it, and leave the rest of the user's configuration untouched.
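A minimal sketch of that behavior, operating on a plain dict shaped like the config above. The function name `with_default_shm_size` and the `"10G"` default are hypothetical and are not taken from the GraphStorm code; only the `local` / `container_config` / `shm_size` keys come from the example file.

```python
import copy

# Hypothetical fallback value, not the one used in GraphStorm.
DEFAULT_SHM_SIZE = "10G"


def with_default_shm_size(config, default_shm_size=DEFAULT_SHM_SIZE):
    """Return a copy of `config` with shm_size set only if the user left it unset.

    All other user-provided keys (region_name, environment, ...) are preserved.
    """
    merged = copy.deepcopy(config) if config else {}

    # Treat a missing or empty "local" section as an empty dict.
    local_cfg = merged.get("local") or {}
    merged["local"] = local_cfg

    # Same for "container_config" inside "local".
    container_cfg = local_cfg.get("container_config") or {}
    local_cfg["container_config"] = container_cfg

    # setdefault keeps an existing user value and only fills in a missing one.
    container_cfg.setdefault("shm_size", default_shm_size)
    return merged
```

In the pipeline script this could be applied to the dict read from `local_session.config` before it is assigned back, assuming the session accepts the merged dict the same way it accepts the current override.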

thvasilo, Mar 12 '25 00:03