Don't set `sagemaker_s3_output` via hyperparameter

Open samuel-massinon opened this issue 4 years ago • 0 comments

Describe the feature you'd like

I discovered an undocumented feature: we can pass a hyperparameter named `sagemaker_s3_output` with an S3 URI. Data written to /opt/ml/output/intermediate is then uploaded to that S3 URI during the training job.
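For illustration, here is a minimal sketch of how the hyperparameter map could be assembled today. The helper name `build_hyperparameters` and the bucket path are hypothetical; the only thing taken from the behavior described above is the `sagemaker_s3_output` key. It assumes values are passed as plain strings, as the CreateTrainingJob API expects; other clients may encode values differently.

```python
def build_hyperparameters(model_params, intermediate_s3_uri):
    """Return a string-to-string hyperparameter map for a training job.

    `build_hyperparameters` is a hypothetical helper, not part of the toolkit.
    """
    if not intermediate_s3_uri.startswith("s3://"):
        raise ValueError("expected an S3 URI such as s3://bucket/prefix")

    # The API takes hyperparameter values as strings.
    hyperparameters = {name: str(value) for name, value in model_params.items()}

    # Configuration smuggled in next to the real hyperparameters:
    # this is the undocumented key described in this issue.
    hyperparameters["sagemaker_s3_output"] = intermediate_s3_uri
    return hyperparameters


# Example call with a made-up bucket name.
hyperparameters = build_hyperparameters(
    {"learning_rate": 0.1, "epochs": 5},
    "s3://my-bucket/intermediate",
)
```

The resulting map mixes a configuration value with the tunable values, which is what the concerns below are about.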

I would like to take advantage of this, though I have 2 concerns.

  1. Hyperparameters should be reserved exclusively for actual hyperparameters, not configuration information.
  2. This becomes a real issue if we warm start a tuning job with a different `sagemaker_s3_output`: the value would influence the tuning strategy even though it shouldn't.

How would this feature be used? Please describe.

The `sagemaker_s3_output` value should be set via the Environment parameter of SageMaker CreateTrainingJob.

The main issue with this is that SageMaker CreateHyperParameterTuningJob has no Environment parameter (this is another feature request I've submitted to the SageMaker team).
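A rough sketch of what the requested behavior might look like inside the container: read the value from an environment variable first and fall back to the hyperparameter. The variable name `SAGEMAKER_S3_OUTPUT` and both function names are assumptions made for this example; none of this is current toolkit behavior.

```python
import os

# Hypothetical environment variable name, chosen only for this sketch.
S3_OUTPUT_ENV_VAR = "SAGEMAKER_S3_OUTPUT"


def training_job_environment(intermediate_s3_uri):
    """Return the fragment of a CreateTrainingJob request that carries the value."""
    return {"Environment": {S3_OUTPUT_ENV_VAR: intermediate_s3_uri}}


def resolve_s3_output(hyperparameters, environ=None):
    """Prefer the environment variable, then fall back to the hyperparameter.

    Returns None when neither source provides a value.
    """
    environ = os.environ if environ is None else environ
    from_env = environ.get(S3_OUTPUT_ENV_VAR)
    if from_env:
        return from_env
    return hyperparameters.get("sagemaker_s3_output")
```

Keeping the hyperparameter as a fallback would let tuning jobs keep working until CreateHyperParameterTuningJob also accepts an Environment parameter.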

Describe alternatives you've considered

We could simply pass `sagemaker_s3_output` as a hyperparameter and try to make sure its value doesn't change between warm starts.
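One way to enforce that workaround is a small pre-flight check before launching the warm-started tuning job. This is only a sketch: `find_output_mismatches` is a hypothetical helper, and it assumes the static hyperparameters of the new job and of each parent job are already available as plain dicts (for example, collected from each parent's job description).

```python
def find_output_mismatches(new_static, parent_statics, key="sagemaker_s3_output"):
    """Return the indices of parent jobs whose value for `key` differs from the new job's.

    A parent that does not set `key` at all counts as a mismatch
    when the new job does set it.
    """
    expected = new_static.get(key)
    return [
        index
        for index, parent in enumerate(parent_statics)
        if parent.get(key) != expected
    ]


# Example with made-up values: the second and third parents do not match.
mismatches = find_output_mismatches(
    {"sagemaker_s3_output": "s3://my-bucket/run"},
    [
        {"sagemaker_s3_output": "s3://my-bucket/run"},
        {"sagemaker_s3_output": "s3://my-bucket/other"},
        {},
    ],
)
```

An empty result means every parent used the same value, so the setting cannot skew the warm start.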

samuel-massinon, Apr 21 '21 15:04