amazon-sagemaker-examples

[Bug Report]

Open • dan-ringwald opened this issue 2 years ago • 2 comments

Link to the notebook

All the PyTorch Neo compilation jobs in this directory

Describe the bug

Running the PyTorch example notebooks unchanged, on an ml.c5.xlarge instance with the conda_pytorch_p38 kernel, yields the following error:

ClientError: An error occurred (ValidationException) when calling the CreateCompilationJob operation: Unsupported framework version field for target. Framework version is supported for Target Platform configuration and only part of target devices. 
Framework version is only supported for ml_c4, ml_c5, ml_m4, ml_m5, ml_p2, ml_p3, ml_g4dn cloud targetsand lambda, jetson_tx1, jetson_nano, jetson_tx2, jetson_xavier, deeplens, rasp3b, rasp4b, imx8qm, rk3288, rk3399, aisage, sbe_c, qcs605, qcs603, x86_win32, x86_win64 edge devices.

The same run was working on Friday morning.
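As a quick sanity check, the target list quoted in the error message can be compared against the instance family used here. The sketch below copies the device names from the message verbatim. The helper names `instance_type_to_family` and `supports_framework_version` are hypothetical and not part of the SageMaker API, and it assumes the compilation target matches the notebook instance family.

```python
# Targets copied verbatim from the ValidationException message above.
CLOUD_TARGETS = {"ml_c4", "ml_c5", "ml_m4", "ml_m5", "ml_p2", "ml_p3", "ml_g4dn"}
EDGE_TARGETS = {
    "lambda", "jetson_tx1", "jetson_nano", "jetson_tx2", "jetson_xavier",
    "deeplens", "rasp3b", "rasp4b", "imx8qm", "rk3288", "rk3399", "aisage",
    "sbe_c", "qcs605", "qcs603", "x86_win32", "x86_win64",
}


def instance_type_to_family(instance_type: str) -> str:
    """Hypothetical helper: turn 'ml.c5.xlarge' into the Neo-style 'ml_c5'."""
    parts = instance_type.split(".")
    if len(parts) < 2:
        raise ValueError(f"unexpected instance type: {instance_type!r}")
    return "_".join(parts[:2])


def supports_framework_version(target: str) -> bool:
    """True if the error message lists this target as accepting a framework version."""
    return target in CLOUD_TARGETS or target in EDGE_TARGETS


family = instance_type_to_family("ml.c5.xlarge")
print(family, supports_framework_version(family))  # prints: ml_c5 True
```

Under that assumption, ml_c5 is in the supported list, so the target itself does not look like the cause of the rejection.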

To reproduce

  • Launch a SageMaker notebook instance on an ml.c5.xlarge machine, tweaking memory to 15GB and cloning the https://github.com/aws/amazon-sagemaker-examples/ repo in the startup configuration.
  • Upon startup, launch one of the PyTorch Neo compilation notebooks (listed above) with the conda_pytorch_p38 kernel and execute the notebook cells in order.
  • The ClientError occurs at the compilation step (looks like neo_model = pytorch_model.compile(...))
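For readers without the notebook open, the compilation step has roughly the following shape. This is a sketch, not the notebook's actual code: the entry point name, input shape, target family, and framework version are placeholder assumptions, and `build_compile_kwargs` and `run_compile` are hypothetical helpers. Only `PyTorchModel` and its `compile` method come from the SageMaker Python SDK.

```python
def build_compile_kwargs(role, output_path, framework_version="1.8"):
    """Collect the arguments for PyTorchModel.compile (all values are placeholders)."""
    return {
        "target_instance_family": "ml_c5",            # assumed target
        "input_shape": {"input0": [1, 3, 224, 224]},  # assumed image input
        "output_path": output_path,
        "framework": "pytorch",
        "framework_version": framework_version,
        "role": role,
    }


def run_compile(model_data, role, output_path,
                entry_point="inference.py", framework_version="1.8"):
    """Create the model and start a Neo compilation job (needs AWS credentials)."""
    # Imported lazily so build_compile_kwargs can be inspected without the SDK.
    from sagemaker.pytorch.model import PyTorchModel

    pytorch_model = PyTorchModel(
        model_data=model_data,
        role=role,
        entry_point=entry_point,  # hypothetical script name
        framework_version=framework_version,
        py_version="py3",
    )
    # The step at which the ClientError in this report is raised.
    return pytorch_model.compile(
        **build_compile_kwargs(role, output_path, framework_version)
    )
```

Keeping the arguments in a plain dictionary makes it easy to print them and compare against the target list in the error message before submitting a job.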

What I've tried

  • I've tried changing the notebook pytorch version to every version between 1.5.1 and 1.11.0
  • I've tried changing the framework_version argument of the PyTorchModel object and the compile method to the corresponding versions.

I would appreciate any help in sorting out what is going wrong.

Dan Ringwald

dan-ringwald, Mar 21 '22 14:03

Hello @dan-ringwald, I encountered the same error with SageMaker Python SDK v2.80.0.

Could you check your SDK version with this command in the notebook cell?

!pip show sagemaker boto3 botocore

The workaround is to pin an older version, 2.79.0. This worked for me.

!pip install -U sagemaker==2.79.0

hariby, Mar 23 '22 08:03

Hello @hariby, sorry for the late reply. Next time I start up my SageMaker compilation notebook I will double-check the SDK version, but I am pretty sure it was v2.80, as I checked that I had the latest version. I will let you know if the version downgrade does the trick.

Edit: It did the trick. The new version v2.81 also triggers the error, but 2.79 works fine. Thank you very much for your help.

dan-ringwald, Mar 28 '22 14:03
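The version findings from this thread can be captured in a small check to run before compiling. The sketch below uses only the Python standard library. The function names are hypothetical, and the status table reflects only what was reported above (2.79 works; 2.80 and 2.81 trigger the error). Matching on major.minor is an assumption, and any other version is reported as unknown rather than guessed.

```python
from importlib import metadata

# Status of sagemaker SDK releases as reported in this thread only.
REPORTED = {(2, 79): "works", (2, 80): "fails", (2, 81): "fails"}


def reported_status(version: str) -> str:
    """Map a version string such as '2.80.0' to the status reported in the thread."""
    parts = version.split(".")
    try:
        key = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        return "unknown"
    return REPORTED.get(key, "unknown")


def installed_sagemaker_status() -> str:
    """Describe the locally installed sagemaker package, if there is one."""
    try:
        version = metadata.version("sagemaker")
    except metadata.PackageNotFoundError:
        return "sagemaker is not installed"
    return f"sagemaker {version}: {reported_status(version)}"


print(installed_sagemaker_status())
```

If the result is "fails", pinning the SDK to 2.79.0 as described above is the workaround reported in this thread.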