sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker

Results: 519 sagemaker-python-sdk issues

**Describe the feature you'd like** Consider raising the character limit on arguments passed to scripts a little. Right now, each parameter passed to a step through `job_arguments` must be no...

type: feature request
component: processing

**What did you find confusing? Please describe.** * https://sagemaker.readthedocs.io/en/stable/overview.html#configuring-and-using-defaults-with-the-sagemaker-python-sdk * ModelTrainer has support but not in the `readthedocs` - https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/modules/train/model_trainer.py#L278 **Describe how documentation can be improved** A clear and concise...

type: documentation
component: pysdk-team

**Describe the bug** Local Mode is supposed to closely mirror the remote execution behaviour. Yet, execution ids in local mode seem to be created with [`uuid.uuid4()`](https://github.com/aws/sagemaker-python-sdk/blob/fd566bd23e6441617af7a28fb648697c2f66304c/src/sagemaker/local/entities.py#L683) ```python class _LocalPipeline(object): ......

type: bug
component: local mode

**Describe the bug** Updating an endpoint using a new model with only serverless config will fail and will throw an error `ValueError: Failed to parse instance type 'None': 'NoneType' object...

type: bug
component: Inference APIs and Interfaces

**Describe the bug** When running local mode, the cell never finishes and eventually times out: `/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/sagemaker/local/entities.py:993 in _wait_for_serving_container`, at line 990 `while True:`...

type: bug
component: local mode

**TensorBoard get_url feature not creating proper redirect link to SageMaker Studio** **Describe the bug** The `get_app_url` method in the TensorBoard application class (`sagemaker.interactive_apps.tensorboard.TensorBoardApp`) does not create proper redirect...

type: bug
component: pysdk-team

**Describe the bug** With ModelTrainer, when I'm using the command parameter in the SourceCode with an argument provided as part of the command, for example `python launcher.py -e test.py`, hyperparameters...

type: bug
component: training

**Describe the bug** The PyTorch estimator doesn't allow setting a `checkpoint_s3_uri` when working with a heterogeneous cluster; it returns the following error: `/Users/bpistone/miniforge3/envs/ray-env/lib/python3.12/site-packages/sagemaker/estimator.py:3646 in _validate_and_set_debugger_configs`...

type: bug
component: training

**Describe the bug** I'm trying to train a model using the `sagemaker.modules.train.ModelTrainer` API. However, it keeps trying to validate the SageMaker session using Pydantic, only never to accept any possible...

type: bug
component: training