sagemaker-debugger
Cannot run a custom container using smdistributed/dataparallel unless USE_SMDEBUG is turned off
After countless hours of trying to get an Estimator() to run on a custom image_uri in smdistributed/dataparallel mode (it kept failing when trying to import any library not included in the SageMaker DLC images), I finally discovered, buried in the sagemaker.huggingface.HuggingFace estimator, that its API request to SageMaker adds the env var
"USE_SMDEBUG": "0"
I set this env var in my custom Docker container and suddenly everything worked: imports from custom libraries succeeded with no problem.
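For anyone hitting the same thing, here is a rough sketch of setting the variable from the Python SDK side instead of baking it into the image. It assumes the generic `sagemaker.estimator.Estimator` and its `environment` argument; the helper names, the instance type and count, and the placeholder image URI and role are all made up for illustration. The smdistributed/dataparallel settings are left out because they depend on your setup. I have not confirmed that this is the officially recommended fix, only that setting the variable is what unblocked my job.

```python
# Sketch only: build Estimator arguments that disable smdebug via an env var.
# build_estimator_kwargs and launch are hypothetical helpers, not SDK functions.

def build_estimator_kwargs(image_uri, role, instance_type="ml.p3.16xlarge",
                           instance_count=2, extra_env=None):
    """Return keyword arguments for an Estimator with USE_SMDEBUG set to "0"."""
    env = dict(extra_env or {})
    # The same value the HuggingFace estimator sends, per the description above.
    env["USE_SMDEBUG"] = "0"
    return {
        "image_uri": image_uri,
        "role": role,
        "instance_type": instance_type,    # assumed value, adjust as needed
        "instance_count": instance_count,  # assumed value, adjust as needed
        "environment": env,
    }


def launch(kwargs, inputs=None):
    """Start a training job; requires the sagemaker SDK and AWS credentials."""
    from sagemaker.estimator import Estimator  # imported lazily on purpose
    estimator = Estimator(**kwargs)
    estimator.fit(inputs)
    return estimator


# Placeholder values, not real resources.
kwargs = build_estimator_kwargs(
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>",
    role="<execution-role-arn>",
)
print(kwargs["environment"])
```

Add your own distribution settings to the returned dict, then call `launch(kwargs)` to submit the job.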
Is this documented anywhere?