sagemaker-python-sdk
sagemaker.session.download_data() is unable to download S3 content.
Describe the bug
If nested objects from an S3 bucket are downloaded to a temp file using boto3.client.download_file() and then re-uploaded to another S3 bucket using boto3.client.upload_file(), sagemaker.session.download_data() is unable to download the re-uploaded objects. It fails with the error: [Errno 21] Is a directory: ....
To reproduce
- Create a source S3 bucket, source-s3-bucket, and a destination S3 bucket, destination-s3-bucket, and assign them the proper permissions.
- Create simple nested content in source-s3-bucket, such as: dir1/file1
- Use the following code to download and then re-upload the content:
import boto3
import sagemaker

s3_resource = boto3.resource("s3")
s3_client = boto3.client("s3", region_name="us-west-2")
sagemaker_session = sagemaker.Session()

# Copy every object from the source bucket to the destination bucket by
# downloading it to a temp file and re-uploading it under the same key.
source_bucket = "source-s3-bucket"  # replace with your source bucket name
dest_bucket = "destination-s3-bucket"  # replace with your destination bucket name
for obj in s3_resource.Bucket(source_bucket).objects.filter(Prefix=""):
    s3_client.download_file(source_bucket, obj.key, "temp.file")
    s3_client.upload_file("temp.file", dest_bucket, obj.key)

# This call fails with: [Errno 21] Is a directory
sagemaker_session.download_data(path=".", bucket=dest_bucket, key_prefix="")
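A possible way to narrow this down is to check whether the destination bucket contains keys ending in "/" (for example dir1/), since such a key would map to a local directory rather than a file. This report does not confirm that as the cause, so treat the following as a minimal diagnostic sketch under that assumption. The helper name split_directory_markers is hypothetical and not part of boto3 or the SageMaker SDK.

```python
def split_directory_markers(keys):
    """Separate keys ending in '/' from regular object keys.

    Assumption (not confirmed by this report): keys ending in '/' are
    "directory marker" objects that cannot be written to a local file path.
    """
    markers = [key for key in keys if key.endswith("/")]
    files = [key for key in keys if not key.endswith("/")]
    return markers, files


# Example with the nested layout from the reproduction steps.
markers, files = split_directory_markers(["dir1/", "dir1/file1"])
print("directory-like keys:", markers)
print("regular keys:", files)
```

The key list could be built from the same s3_resource.Bucket(...).objects.filter(Prefix="") iteration used in the reproduction code, collecting obj.key for each object.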
Expected behavior
sagemaker_session.download_data() should be able to download the content from destination-s3-bucket to a local directory.
The AWS CLI is able to do this using:
!aws s3 cp --recursive ##destination-s3-bucket## "./"
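Until download_data() handles this case, a recursive download can be approximated directly with a boto3 S3 client. The sketch below is a workaround under the unconfirmed assumption that keys ending in "/" are what trigger the error, so it skips them. The function name download_prefix is hypothetical; the client is passed in as a parameter and is expected to behave like boto3.client("s3").

```python
import os


def download_prefix(s3_client, bucket, key_prefix, local_dir):
    """Download every object under key_prefix into local_dir.

    Keys ending in '/' are skipped (assumed to be directory markers).
    Returns the list of local file paths that were written.
    """
    downloaded = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=key_prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # nothing to write for a directory-like key
            target = os.path.join(local_dir, *key.split("/"))
            # Create parent directories so nested keys such as dir1/file1 work.
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            s3_client.download_file(bucket, key, target)
            downloaded.append(target)
    return downloaded
```

With the names from the reproduction code, this could be called as download_prefix(s3_client, dest_bucket, "", "."). Unlike download_data(), it keeps the full key as the relative local path.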
Screenshots or logs
System information
- SageMaker Python SDK version: sagemaker 2.109.0, awscli 1.25.72, boto3 1.24.71
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
- Framework version: N/A
- Python version: 3.8.10
- CPU or GPU: Both
- Custom Docker image (Y/N): N.
Additional context
This should be reproducible in SageMaker Studio using the PyTorch 1.10 Python 3.8 CPU and PyTorch 1.10 Python 3.8 GPU images, either with the preinstalled library versions or after upgrading them to the latest versions (sagemaker 2.109.0, awscli 1.25.72, boto3 1.24.71 at the time of writing) using !pip install -U sagemaker boto3 awscli.