
Hyperparameters not provided when using a `.sh` script as Estimator entrypoint

Open Guillem96 opened this issue 2 years ago • 3 comments

Describe the bug

When I set a .sh script as an Estimator entry point, the hyperparameters are not passed to it as CLI options.

To reproduce

# train.sh
echo "python train.py $@"  # For some reason $@ is empty

# run_train_job.py
from sagemaker.pytorch.estimator import PyTorch

input_path = ...
job_suffix = ...  # placeholder; used in job_name below
instance_type = "ml.m5.large"

estimator = PyTorch(
        entry_point="train.sh",
        source_dir=".",
        instance_type=instance_type,
        instance_count=1,
        framework_version="1.10",
        py_version="py38",
        hyperparameters={"test": "test-value"},
)
estimator.fit({"train": input_path}, job_name=f"jk-training-job-{job_suffix}")

Expected behavior

The train.sh script echoes: python train.py --test test-value. Instead, $@ is empty.

Additional context

Checking the CloudWatch logs, I see that SageMaker tries to run my script like so:

$ /bin/sh -c train.sh  --test test-value

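If I run this command locally, I can easily reproduce the error. I've realized that it's the -c option that "empties" $@ (and other special variables): with -c, the arguments that follow the command string become $0, $1, ... of the shell that -c starts, and they are not passed on to train.sh. Is there any particular reason to use the -c flag?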
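
The effect of -c can be reproduced outside SageMaker. The following is a minimal sketch, not SageMaker's own code: it writes a throwaway train.sh to a temporary directory, runs it with the same command shape as the log line above, then runs it again with the arguments forwarded explicitly through "$@". The helper names sh_c and demo are made up for illustration, and the sketch assumes /bin/sh exists and that files in the temporary directory may be executed.

```python
import os
import subprocess
import tempfile

# Same body as train.sh in the report, plus a shebang so it can be executed.
SCRIPT = '#!/bin/sh\necho "python train.py $@"\n'


def sh_c(command: str, *args: str) -> str:
    """Run `/bin/sh -c <command> <args...>` and return its stripped stdout."""
    result = subprocess.run(
        ["/bin/sh", "-c", command, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def demo() -> tuple[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "train.sh")
        with open(script, "w") as f:
            f.write(SCRIPT)
        os.chmod(script, 0o755)

        # Same shape as the logged command: the trailing arguments become
        # $0 and $1 of the `sh -c` shell, so train.sh itself receives none.
        as_logged = sh_c(script, "--test", "test-value")

        # Forwarding explicitly: "sh" fills $0, and "$@" hands the rest on.
        forwarded = sh_c(f'"{script}" "$@"', "sh", "--test", "test-value")
    return as_logged, forwarded


if __name__ == "__main__":
    for line in demo():
        print(repr(line))
```

In the first call the script prints only "python train.py"; in the second it prints "python train.py --test test-value", because the command string itself passes "$@" on to the script.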

System information

  • SageMaker Python SDK version: 2.88.3
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
  • Framework version: 1.10
  • Python version: py38
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Guillem96 • May 11 '22 08:05

Right now we are dealing with this issue too, using the following snippet.

# Handle user arguments
if [ -n "${SM_USER_ARGS}" ]; then
    user_args=`echo "
import os
print(
    \" \".join(
        map(str, eval(os.environ[\"SM_USER_ARGS\"]))
    )
)" | python3`
else
    user_args="${@}"
fi

It works but just looking at it feels bad ...

jponf • May 27 '22 13:05
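
A variant of the workaround above moves the Python part into a small helper file and avoids eval. This is only a sketch under one assumption: that SM_USER_ARGS holds a JSON-encoded list of strings such as ["--test", "test-value"], which matches the way the snippet above evaluates it as a list. The function name shell_quoted_user_args is hypothetical. Unlike a plain join on spaces, shlex.join quotes each argument, so values that contain spaces survive the trip back into the shell.

```python
import json
import os
import shlex


def shell_quoted_user_args(environ=os.environ) -> str:
    """Return the SM_USER_ARGS entries as one shell-safe string ('' if unset)."""
    raw = environ.get("SM_USER_ARGS", "")
    if not raw:
        return ""
    # Assumption: the variable is a JSON list, e.g. ["--test", "test-value"].
    return shlex.join(str(arg) for arg in json.loads(raw))


if __name__ == "__main__":
    print(shell_quoted_user_args())
```

If this were saved as, say, user_args.py next to train.sh (a hypothetical file name), the shell script could restore its positional parameters with eval "set -- $(python3 user_args.py)" and then call python train.py "$@" as originally intended.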

Thanks! Good workaround. 👌🏼 But I am still waiting for a fix, or at least an explanation, from the SageMaker team.

Guillem96 • Jun 03 '22 09:06

According to @jponf's workaround, it seems that the hyper-parameters are provided to the entry point script through the SM_USER_ARGS environment variable. @Guillem96 you seem to have a bash script that launches Python, but you could directly pass the path to your Python file, right? And from there read the SM_USER_ARGS environment variable more easily. Would that solve your issue?

davidbrochart • Sep 21 '23 15:09
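
Reading SM_USER_ARGS from the Python side, as suggested in the last comment, could look roughly like the sketch below. It assumes the variable holds a JSON-encoded list of strings (consistent with the earlier workaround) and falls back to the regular command line when the variable is absent. The function names get_cli_args and parse_hyperparameters are hypothetical, and --test is the only hyperparameter because it is the only one in the report.

```python
import argparse
import json
import os
import sys


def get_cli_args(environ=os.environ, argv=None) -> list:
    """Prefer SM_USER_ARGS (assumed to be a JSON list); else use argv."""
    raw = environ.get("SM_USER_ARGS")
    if raw:
        return [str(arg) for arg in json.loads(raw)]
    return list(sys.argv[1:]) if argv is None else list(argv)


def parse_hyperparameters(cli_args) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--test", default=None)
    # parse_known_args tolerates extra hyperparameters not declared here.
    args, _unknown = parser.parse_known_args(cli_args)
    return args


if __name__ == "__main__":
    hyperparameters = parse_hyperparameters(get_cli_args())
    print(hyperparameters.test)
```

With this in train.py, the script behaves the same whether the hyperparameters arrive on the command line or only through the environment variable.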