amazon-sagemaker-examples

[Bug Report]

Open acere opened this issue 2 years ago • 12 comments

Link to the notebook: Train a TensorFlow model with a SageMaker Training Job and track it using SageMaker Experiments

Describe the bug: When executing the notebook, the model training step (the 8th cell in the notebook) fails with:

ParamValidationError: Parameter validation failed:
Unknown parameter in ProfilerConfig: "DisableProfiler", must be one of: S3OutputPath, ProfilingIntervalInMilliseconds, ProfilingParameters

Bug replicated in SageMaker Studio domains in ap-southeast-1 and us-east-2.

To reproduce: Run the notebook step by step.

Logs (error trace):

INFO:sagemaker.image_uris:image_uri is not presented, retrieving image_uri based on instance_type, framework etc.
INFO:sagemaker.image_uris:image_uri is not presented, retrieving image_uri based on instance_type, framework etc.
INFO:sagemaker:Creating training-job with name: tensorflow-training-2022-12-20-10-27-40-801

---------------------------------------------------------------------------
ParamValidationError                      Traceback (most recent call last)
<ipython-input-8-952c129da21f> in <module>
     30     )
     31 
---> 32     est.fit()

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/pipeline_context.py in wrapper(*args, **kwargs)
    270             return _StepArguments(retrieve_caller_name(self_instance), run_func, *args, **kwargs)
    271 
--> 272         return run_func(*args, **kwargs)
    273 
    274     return wrapper

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in fit(self, inputs, wait, logs, job_name, experiment_config)
   1128 
   1129         experiment_config = check_and_get_run_experiment_config(experiment_config)
-> 1130         self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
   1131         self.jobs.append(self.latest_training_job)
   1132         if wait:

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in start_new(cls, estimator, inputs, experiment_config)
   2046         train_args = cls._get_train_args(estimator, inputs, experiment_config)
   2047 
-> 2048         estimator.sagemaker_session.train(**train_args)
   2049 
   2050         return cls(estimator.sagemaker_session, estimator._current_job_name)

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in train(self, input_mode, input_config, role, job_name, output_config, resource_config, vpc_config, hyperparameters, stop_condition, tags, metric_definitions, enable_network_isolation, image_uri, algorithm_arn, encrypt_inter_container_traffic, use_spot_instances, checkpoint_s3_uri, checkpoint_local_path, experiment_config, debugger_rule_configs, debugger_hook_config, tensorboard_output_config, enable_sagemaker_metrics, profiler_rule_configs, profiler_config, environment, retry_strategy)
    625             self.sagemaker_client.create_training_job(**request)
    626 
--> 627         self._intercept_create_request(train_request, submit, self.train.__name__)
    628 
    629     def _get_train_request(  # noqa: C901

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in _intercept_create_request(self, request, create, func_name)
   4654             func_name (str): the name of the function needed intercepting
   4655         """
-> 4656         return create(request)
   4657 
   4658 

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in submit(request)
    623             LOGGER.info("Creating training-job with name: %s", job_name)
    624             LOGGER.debug("train request: %s", json.dumps(request, indent=4))
--> 625             self.sagemaker_client.create_training_job(**request)
    626 
    627         self._intercept_create_request(train_request, submit, self.train.__name__)

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    528                 )
    529             # The "self" in this scope is referring to the BaseClient.
--> 530             return self._make_api_call(operation_name, kwargs)
    531 
    532         _api_call.__name__ = str(py_operation_name)

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    922             endpoint_url=endpoint_url,
    923             context=request_context,
--> 924             headers=additional_headers,
    925         )
    926         resolve_checksum_context(request_dict, operation_model, api_params)

/opt/conda/lib/python3.7/site-packages/botocore/client.py in _convert_to_request_dict(self, api_params, operation_model, endpoint_url, context, headers, set_user_agent_header)
    989         )
    990         request_dict = self._serializer.serialize_to_request(
--> 991             api_params, operation_model
    992         )
    993         if not self._client_config.inject_host_prefix:

/opt/conda/lib/python3.7/site-packages/botocore/validate.py in serialize_to_request(self, parameters, operation_model)
    379             )
    380             if report.has_errors():
--> 381                 raise ParamValidationError(report=report.generate_report())
    382         return self._serializer.serialize_to_request(
    383             parameters, operation_model

ParamValidationError: Parameter validation failed:
Unknown parameter in ProfilerConfig: "DisableProfiler", must be one of: S3OutputPath, ProfilingIntervalInMilliseconds, ProfilingParameters

SageMaker Python SDK version: 2.125.0
Boto3 version: 1.26.33
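
The versions above can also be collected programmatically, which helps when comparing a kernel that fails with one that works. The sketch below is a minimal example; the function name `installed_versions` is hypothetical, and it assumes the packages were installed with pip or conda so that their metadata is discoverable. On the Python 3.7 kernel shown in the traceback it falls back to the `importlib_metadata` backport, which appears in the pip list as importlib-metadata.

```python
# Sketch: report the installed versions of the packages involved in this issue.
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for Python 3.7


def installed_versions(packages=("sagemaker", "boto3", "botocore")):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both a failing and a working Studio kernel should show whether the two environments load the same sagemaker, boto3, and botocore builds.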

output of pip list:

Package                              Version
------------------------------------ -----------------
absl-py                              1.3.0
aiobotocore                          2.4.1
aiohttp                              3.8.3
aioitertools                         0.11.0
aiosignal                            1.3.1
alabaster                            0.7.12
anaconda-client                      1.7.2
anaconda-project                     0.8.3
ansi2html                            1.8.0
anyio                                3.6.2
argh                                 0.26.2
argon2-cffi                          21.3.0
argon2-cffi-bindings                 21.2.0
asn1crypto                           1.3.0
astroid                              2.12.13
astropy                              4.0
astunparse                           1.6.3
async-timeout                        4.0.2
asynctest                            0.13.0
atomicwrites                         1.3.0
attrs                                22.1.0
autopep8                             1.4.4
autovizwidget                        0.20.0
awscli                               1.27.24
Babel                                2.11.0
backcall                             0.1.0
backports.shutil-get-terminal-size   1.0.0
beautifulsoup4                       4.8.2
bitarray                             1.2.1
bkcharts                             0.2
bleach                               5.0.1
bokeh                                1.4.0
boto                                 2.49.0
boto3                                1.26.33
botocore                             1.29.33
Bottleneck                           1.3.2
brotlipy                             0.7.0
cached-property                      1.5.2
cachetools                           5.2.0
certifi                              2022.9.24
cffi                                 1.15.0
chardet                              3.0.4
charset-normalizer                   2.0.4
Click                                7.0
cloudpickle                          2.2.0
clyent                               1.2.2
colorama                             0.4.3
conda                                22.9.0
conda-package-handling               1.8.1
contextlib2                          0.6.0.post1
cryptography                         38.0.4
cycler                               0.10.0
Cython                               0.29.15
cytoolz                              0.10.1
dash                                 2.7.0
dash-core-components                 2.0.0
dash-html-components                 2.0.0
dash-table                           5.0.0
dask                                 2022.2.0
decorator                            4.4.1
defusedxml                           0.6.0
diff-match-patch                     20181111
dill                                 0.3.6
distributed                          2022.2.0
distro                               1.8.0
docker                               6.0.1
docker-compose                       1.29.2
dockerpty                            0.4.1
docopt                               0.6.2
docutils                             0.16
dparse                               0.6.2
entrypoints                          0.3
et-xmlfile                           1.0.1
fastcache                            1.1.0
fastjsonschema                       2.16.2
filelock                             3.0.12
flake8                               3.7.9
Flask                                1.1.1
flatbuffers                          22.12.6
frozenlist                           1.3.3
fsspec                               2022.11.0
future                               0.18.2
gast                                 0.4.0
gevent                               1.4.0
glob2                                0.7
gmpy2                                2.0.8
google-auth                          2.15.0
google-auth-oauthlib                 0.4.6
google-pasta                         0.2.0
greenlet                             0.4.15
grpcio                               1.51.1
h5py                                 2.10.0
hdijupyterutils                      0.20.0
HeapDict                             1.0.1
html5lib                             1.0.1
hypothesis                           5.5.4
idna                                 2.8
imageio                              2.6.1
imagesize                            1.2.0
importlib-metadata                   4.13.0
intervaltree                         3.0.2
ipykernel                            5.1.4
ipython                              7.34.0
ipython_genutils                     0.2.0
ipywidgets                           7.5.1
isort                                4.3.21
itsdangerous                         1.1.0
jdcal                                1.4.1
jedi                                 0.18.2
jeepney                              0.4.2
Jinja2                               3.1.2
jmespath                             1.0.1
joblib                               0.14.1
json5                                0.9.1
jsonschema                           3.2.0
jupyter                              1.0.0
jupyter_client                       7.4.8
jupyter-console                      6.1.0
jupyter_core                         4.12.0
jupyter-dash                         0.4.2
jupyter-server                       1.23.3
jupyterlab                           1.2.21
jupyterlab-pygments                  0.2.2
jupyterlab-server                    1.0.6
keras                                2.11.0
keyring                              21.1.0
kiwisolver                           1.1.0
lazy-object-proxy                    1.4.3
libarchive-c                         2.8
libclang                             14.0.6
lief                                 0.9.0
llvmlite                             0.39.1
locket                               0.2.0
lxml                                 4.9.1
Markdown                             3.4.1
MarkupSafe                           2.1.1
matplotlib                           3.1.3
matplotlib-inline                    0.1.6
mccabe                               0.6.1
mistune                              0.8.4
mkl-fft                              1.0.15
mkl-random                           1.1.0
mkl-service                          2.3.0
mock                                 4.0.1
more-itertools                       8.2.0
mpmath                               1.1.0
msgpack                              0.6.1
multidict                            6.0.3
multipledispatch                     0.6.0
multiprocess                         0.70.14
nbclassic                            0.4.8
nbclient                             0.7.2
nbconvert                            6.5.4
nbformat                             5.7.0
nest-asyncio                         1.5.6
networkx                             2.4
nltk                                 3.7
nose                                 1.3.7
notebook                             6.5.2
notebook_shim                        0.2.2
numba                                0.56.4
numexpr                              2.7.1
numpy                                1.21.6
numpydoc                             0.9.2
oauthlib                             3.2.2
olefile                              0.46
openpyxl                             3.0.3
opt-einsum                           3.3.0
packaging                            20.1
pandas                               1.3.5
pandocfilters                        1.4.2
parso                                0.8.3
partd                                1.1.0
path                                 13.1.0
pathlib2                             2.3.5
pathos                               0.3.0
pathtools                            0.1.2
patsy                                0.5.1
pep8                                 1.7.1
pexpect                              4.8.0
pickleshare                          0.7.5
Pillow                               9.3.0
pip                                  22.3.1
pkginfo                              1.5.0.1
platformdirs                         2.6.0
plotly                               5.8.2
pluggy                               0.13.1
ply                                  3.11
pox                                  0.3.2
ppft                                 1.7.6.6
prometheus-client                    0.7.1
prompt-toolkit                       3.0.3
protobuf                             3.19.6
protobuf3-to-dict                    0.1.5
psutil                               5.6.7
ptyprocess                           0.6.0
pure-sasl                            0.6.2
py                                   1.11.0
pyarrow                              10.0.1
pyasn1                               0.4.8
pyasn1-modules                       0.2.8
pycodestyle                          2.5.0
pycosat                              0.6.3
pycparser                            2.19
pycrypto                             2.6.1
pycurl                               7.43.0.5
pydocstyle                           4.0.1
pyflakes                             2.1.1
pyfunctional                         1.4.3
Pygments                             2.13.0
PyHive                               0.6.5
pykerberos                           1.2.1
pylint                               2.15.8
pyodbc                               4.0.0-unsupported
pyOpenSSL                            22.1.0
pyparsing                            2.4.6
pyrsistent                           0.15.7
PySocks                              1.7.1
pytest                               5.3.5
pytest-arraydiff                     0.3
pytest-astropy                       0.8.0
pytest-astropy-header                0.1.2
pytest-doctestplus                   0.5.0
pytest-openfiles                     0.4.0
pytest-remotedata                    0.3.2
python-dateutil                      2.8.2
python-dotenv                        0.21.0
python-jsonrpc-server                0.3.4
python-language-server               0.31.7
pytz                                 2019.3
PyWavelets                           1.1.1
pyxdg                                0.26
PyYAML                               6.0
pyzmq                                24.0.1
QDarkStyle                           2.8
QtAwesome                            0.6.1
qtconsole                            4.6.0
QtPy                                 1.9.0
regex                                2022.10.31
requests                             2.28.1
requests-kerberos                    0.12.0
requests-oauthlib                    1.3.1
retrying                             1.3.4
rope                                 0.16.0
rsa                                  4.9
Rtree                                0.9.3
ruamel_yaml                          0.15.87
s3fs                                 0.4.2
s3transfer                           0.6.0
sagemaker                            2.125.0
sagemaker-data-insights              0.3.3
sagemaker-datawrangler               0.3.8
sagemaker-scikit-learn-extension     2.5.0
sagemaker-studio-analytics-extension 0.0.14
sagemaker-studio-sparkmagic-lib      0.1.4
sasl                                 0.2.1
schema                               0.7.5
scikit-image                         0.16.2
scikit-learn                         0.22.1
scipy                                1.4.1
seaborn                              0.10.0
SecretStorage                        3.1.2
Send2Trash                           1.8.0
setuptools                           59.3.0
simplegeneric                        0.8.1
singledispatch                       3.4.0.3
six                                  1.14.0
smclarify                            0.3
smdebug-rulesconfig                  1.0.1
sniffio                              1.3.0
snowballstemmer                      2.0.0
sortedcollections                    1.1.2
sortedcontainers                     2.1.0
soupsieve                            1.9.5
sparkmagic                           0.20.0
Sphinx                               2.4.0
sphinxcontrib-applehelp              1.0.1
sphinxcontrib-devhelp                1.0.1
sphinxcontrib-htmlhelp               1.0.2
sphinxcontrib-jsmath                 1.0.1
sphinxcontrib-qthelp                 1.0.2
sphinxcontrib-serializinghtml        1.1.3
sphinxcontrib-websupport             1.2.0
spyder                               4.0.1
spyder-kernels                       1.8.1
SQLAlchemy                           1.3.13
statsmodels                          0.11.0
sympy                                1.5.1
tables                               3.6.1
tabulate                             0.9.0
tblib                                1.6.0
tenacity                             8.1.0
tensorboard                          2.11.0
tensorboard-data-server              0.6.1
tensorboard-plugin-wit               1.8.1
tensorflow                           2.11.0
tensorflow-estimator                 2.11.0
tensorflow-io-gcs-filesystem         0.29.0
termcolor                            2.1.1
terminado                            0.8.3
testpath                             0.4.4
texttable                            1.6.7
thrift                               0.13.0
thrift-sasl                          0.4.3
tinycss2                             1.2.1
toml                                 0.10.2
tomli                                2.0.1
tomlkit                              0.11.6
toolz                                0.10.0
tornado                              6.2
tqdm                                 4.42.1
traitlets                            5.6.0
typed-ast                            1.5.4
typing_extensions                    4.4.0
ujson                                5.6.0
unicodecsv                           0.14.1
urllib3                              1.26.13
watchdog                             0.10.2
wcwidth                              0.1.8
webencodings                         0.5.1
websocket-client                     0.59.0
Werkzeug                             2.2.2
wheel                                0.34.2
widgetsnbextension                   3.5.1
wrapt                                1.11.2
wurlitzer                            2.0.0
xlrd                                 1.2.0
XlsxWriter                           1.2.7
xlwt                                 1.3.0
yapf                                 0.28.0
yarl                                 1.8.2
zict                                 1.0.0
zipp                                 3.11.0

acere, Dec 20 '22 10:12

I'm seeing the same thing in us-east-2 with the SKLearn, TensorFlow, and XGBoost estimators as well.

brianloyal, Dec 21 '22 16:12

Is this happening only in Studio, or for other jobs as well? This commit added the disable_profiler flag: https://github.com/aws/sagemaker-python-sdk/commit/019d5a4b232cd4d287dff35c6a8ba9681ed4c0ca and botocore v1.29.33 seems to have this flag available as well.
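
One way to check whether the botocore in a given kernel knows about the flag is to inspect its bundled SageMaker API model. The following is a sketch, not the validator that botocore actually runs: `profiler_config_members`, `unknown_keys`, and `report` are hypothetical helper names, and it assumes botocore is importable in the kernel being checked.

```python
def profiler_config_members(service_name="sagemaker", shape_name="ProfilerConfig"):
    """Return the member names botocore's bundled API model defines for the shape."""
    import botocore.session  # imported lazily so unknown_keys works without botocore

    model = botocore.session.get_session().get_service_model(service_name)
    return sorted(model.shape_for(shape_name).members)


def unknown_keys(params, allowed):
    """Mimic the validator's complaint: keys in params that the model does not list."""
    return sorted(set(params) - set(allowed))


def report():
    """Print the ProfilerConfig members and whether DisableProfiler would be rejected."""
    members = profiler_config_members()
    print("ProfilerConfig members:", members)
    print("Rejected keys:", unknown_keys({"DisableProfiler": True}, members))
```

Calling `report()` in the affected kernel lists the members the local model accepts. If DisableProfiler is missing from that list, the kernel is loading an API model older than the one the SDK expects, which would be consistent with the error in this issue; treat this as a diagnostic aid rather than a confirmed root cause.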

Roshrini, Dec 21 '22 23:12

@acere Can you create a new user and try again?

Roshrini, Dec 22 '22 21:12

I got the same error message. Downgrading sagemaker to version 2.123.0 with the following command solved my problem: pip install sagemaker==2.123.0

tongliang11, Dec 23 '22 21:12

@acere are you still experiencing this issue? Running that notebook on Studio (Python 3 (Data Science), us-east-2) with sagemaker 2.128.0 right now, I am able to run all cells with no issue.

claytonparnell, Jan 17 '23 22:01

@claytonparnell the problem is still there for older SM Studio users (created before Dec 2022). There isn't any issue for Studio users created after Dec 22 with any version of PySDK > 2.123.0.

acere, Jan 18 '23 01:01

OK, so the solution would be to create a new SageMaker Studio user now (after Dec 2022)?

boriside, Jan 30 '23 20:01

This is solved by using sagemaker==2.123.0. Are there plans to fix this in newer versions?

orangewise, Mar 08 '23 16:03

PyTorch 1.13 and py39 are not available in 2.123. Is there an ETA for getting this fixed?

Rizhiy, Mar 26 '23 18:03

Creating a new user in the domain and then using sagemaker==2.143.0 worked for me.

bengruher, Mar 30 '23 17:03

I tried the same notebook on the same instance and did not have the issue.

I believe the issue is fixed on the latest Data Science image. Please try to shut down the kernel (from the top menu -> open Kernel -> Shut Down) and try again.

adimux, Aug 18 '23 20:08

Missing required parameter in ProfilerConfig: "S3OutputPath"
Unknown parameter in ProfilerConfig: "DisableProfiler", must be one of: S3OutputPath, ProfilingIntervalInMilliseconds, ProfilingParameters

I have this issue. I tried all of the suggestions above, but none of them fixed it!

aebulut, Feb 01 '24 01:02