[SageMaker] Pipeline execution overrides user-defined SageMaker configuration when running locally
SageMaker local execution lets users configure the Docker containers through a local file at $HOME/.sagemaker/config.yaml. See https://aws.amazon.com/blogs/machine-learning/configure-and-use-defaults-for-amazon-sagemaker-resources-with-the-sagemaker-python-sdk/ for details.
An example file can be:
local:
  local_code: true # Using everything locally
  region_name: "us-east-1" # Name of the region
  container_config: # Additional docker container config
    shm_size: "58G"
    environment:
      - AWS_REGION: "us-east-1"
When a local session is created, this configuration is loaded as a dict into the session's config attribute:
from sagemaker.workflow.pipeline_context import LocalPipelineSession
local_session = LocalPipelineSession()
config_dict = local_session.config
However, in execute_sm_pipeline.py we overwrite this config in order to set the shm size for the container, which discards any user-defined settings:
https://github.com/awslabs/graphstorm/blob/a145677290d1ec9f74ac0f702e98752d6cbd4ca5/sagemaker/pipeline/execute_sm_pipeline.py#L164-L172
Instead, we should only set shm_size when the user has not already configured it, and leave the rest of the user's configuration untouched.
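A minimal sketch of the proposed behavior, operating on a plain dict so it can be tried without a SageMaker session. The helper name `with_default_shm_size` and the default value `"20G"` are hypothetical and are not taken from the GraphStorm code; the key layout (`local` -> `container_config` -> `shm_size`) follows the example config file above.

```python
from copy import deepcopy

# Hypothetical fallback value; the actual default used by the pipeline
# script may differ.
DEFAULT_SHM_SIZE = "20G"


def with_default_shm_size(config, default_shm_size=DEFAULT_SHM_SIZE):
    """Return a copy of `config` with shm_size set only if it is missing.

    Any keys the user already configured (including an existing shm_size)
    are preserved as-is.
    """
    merged = deepcopy(config) if config else {}

    # `or {}` also covers keys that are present but empty (None) in the YAML.
    local_cfg = merged.get("local") or {}
    merged["local"] = local_cfg

    container_cfg = local_cfg.get("container_config") or {}
    local_cfg["container_config"] = container_cfg

    # Only fill in shm_size when the user has not provided one.
    container_cfg.setdefault("shm_size", default_shm_size)
    return merged


if __name__ == "__main__":
    user_config = {
        "local": {
            "local_code": True,
            "region_name": "us-east-1",
            "container_config": {"shm_size": "58G"},
        }
    }
    print(with_default_shm_size(user_config)["local"]["container_config"])
    print(with_default_shm_size(None)["local"]["container_config"])
```

In the pipeline script this could be applied as `local_session.config = with_default_shm_size(local_session.config)`, assuming the session's config can be reassigned in the same way the current code overwrites it.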