sagemaker-studio-custom-image-samples

conda-env-kernel-image example is broken

Open tom-mcclintock opened this issue 2 years ago • 5 comments

After following the steps listed here exactly, I started a SageMaker Studio session. After selecting the custom image and opening a console, I received the following error:

Invalid response: 404 Not Found
Kernel with name [myenv] does not exist in image [arn:aws:sagemaker:REGION:ACCOUNT_ID:image/conda-test-kernel] on the KernelGateway App [conda-test-kernel-ml-t3-medium-HASH]. To make the kernel available, either update your AppImageConfig to have same kernel name as available in the image or update your SageMaker Image to have the kernel with the same name as specified in AppImageConfig. You can use https://github.com/aws-samples/sagemaker-studio-custom-image-samples/blob/main/DEVELOPMENT.md#local-testing for testing your image locally.

The Dockerfile and environment.yml are identical to the example. Here is the app-image-config-input.json file:

{
    "AppImageConfigName": "myenv-config",
    "KernelGatewayImageConfig": {
        "KernelSpecs": [
            {
                "Name": "myenv",
                "DisplayName": "Python [conda env: myenv]"
            }
        ],
        "FileSystemConfig": {
            "MountPath": "/home/sagemaker-user",
            "DefaultUid": 0,
            "DefaultGid": 0
        }
    }
}

And here are the anonymized contents of create-domain-input.json:

{
    "DomainId": "d-xxxxxxxxx",
    "DefaultUserSettings": {
        "ExecutionRole": "ROLE_ARN",
        "KernelGatewayAppSettings": {
            "CustomImages": [
                {
                    "ImageName": "conda-test-kernel",
                    "AppImageConfigName": "myenv-config"
                }
            ]
        }
    }
}

I used IMAGE_NAME=conda-test-kernel throughout. Other things to note:

  • aws sagemaker describe-image-version shows "ImageVersionStatus": "CREATED"
  • aws sagemaker describe-app-image-config gives back all the expected information

I believe the issue is that conda doesn't automatically follow the kernelspec, so the kernel inside the image is not registered under the name myenv that the AppImageConfig expects. This quirk needs to be covered in the README for this example. Unfortunately, I haven't figured out the solution yet. Any help is appreciated.
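
One way to narrow this down might be to compare the kernel names the image actually exposes with the names listed under KernelSpecs. The sketch below is only an illustration, not part of the sample repository. It assumes the kernel listing has been captured from inside the container with `jupyter kernelspec list --json`, which in turn assumes `jupyter` is on the PATH in the image. The helper name `missing_kernels` and the example listing are hypothetical; the listing in particular is invented and is not output from the real image.

```python
import json


def missing_kernels(app_image_config, kernelspec_listing):
    """Return KernelSpecs names from the AppImageConfig that the image does not provide.

    kernelspec_listing is the parsed output of `jupyter kernelspec list --json`,
    which maps kernel names to their specs under the "kernelspecs" key.
    """
    specs = app_image_config["KernelGatewayImageConfig"]["KernelSpecs"]
    wanted = [spec["Name"] for spec in specs]
    available = set(kernelspec_listing.get("kernelspecs", {}))
    return [name for name in wanted if name not in available]


# The KernelSpecs part of app-image-config-input.json from this issue.
app_image_config = json.loads("""
{
    "AppImageConfigName": "myenv-config",
    "KernelGatewayImageConfig": {
        "KernelSpecs": [
            {"Name": "myenv", "DisplayName": "Python [conda env: myenv]"}
        ]
    }
}
""")

# Hypothetical listing, invented for illustration only. Replace it with the
# JSON captured from the real container.
listing = {
    "kernelspecs": {
        "python3": {
            "resource_dir": "/hypothetical/path/kernels/python3",
            "spec": {"display_name": "Python 3"},
        }
    }
}

print(missing_kernels(app_image_config, listing))
```

If the returned list is non-empty, the names in it are the ones SageMaker Studio would fail to find. Either the Name in KernelSpecs or the kernel registered in the image would then need to change so that the two match, as the error message suggests.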

tom-mcclintock, Feb 28 '22 21:02