Unable to save tf transformations that use operations from tensorflow_addons
System information
- Have I specified the code to reproduce the issue (Yes, No):
- Environment in which the code is executed (e.g., Local(Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc): MacOS
- TensorFlow version: 2.6.0
- TFX Version: 1.3.0
- Python version: 3.8
- Python dependencies (from pip freeze output):
absl-py==0.11.0
aiobotocore==1.4.2
aiohttp==3.7.4.post0
aioitertools==0.8.0
alembic==1.7.4
apache-beam==2.33.0
appnope==0.1.2
argon2-cffi==21.1.0
astunparse==1.6.3
async-timeout==3.0.1
attrs==20.3.0
autopage==0.4.0
avro-python3==1.9.2.1
backcall==0.2.0
bleach==4.1.0
botocore==1.20.106
cachetools==4.2.4
certifi==2021.10.8
cffi==1.15.0
chardet==4.0.0
charset-normalizer==2.0.7
clang==5.0
click==7.1.2
cliff==3.9.0
cloudpickle==2.0.0
cmaes==0.8.2
cmd2==2.2.0
colorama==0.4.4
colorlog==6.5.0
crcmod==1.7
cryptography==35.0.0
cycler==0.10.0
datadog==0.42.0
ddtrace==0.54.1
debugpy==1.5.0
decorator==5.1.0
defusedxml==0.7.1
Deprecated==1.2.13
dill==0.3.1.1
docker==4.4.4
docopt==0.6.2
docstring-parser==0.11
docutils==0.17.1
entrypoints==0.3
fastavro==1.4.5
fasteners==0.16.3
fire==0.4.0
Flask==2.0.2
flask-accept==0.0.6
flask-expects-json==1.6.0
Flask-HTTPAuth==4.4.0
flatbuffers==1.12
fsspec==2021.10.0
future==0.18.2
gast==0.4.0
gcsfs==2021.10.0
google-api-core==1.31.3
google-api-python-client==1.12.8
google-apitools==0.5.31
google-auth==1.35.0
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.4.6
google-cloud-aiplatform==1.6.0
google-cloud-bigquery==2.28.1
google-cloud-bigquery-storage==2.9.1
google-cloud-bigtable==1.7.0
google-cloud-core==1.7.2
google-cloud-datastore==1.15.3
google-cloud-dlp==1.0.0
google-cloud-language==1.3.0
google-cloud-notebooks==1.1.0
google-cloud-pipeline-components==0.1.9
google-cloud-pubsub==1.7.0
google-cloud-recommendations-ai==0.2.0
google-cloud-spanner==1.19.1
google-cloud-storage==1.42.3
google-cloud-videointelligence==1.16.1
google-cloud-vision==1.0.0
google-crc32c==1.3.0
google-pasta==0.2.0
google-resumable-media==2.0.3
googleapis-common-protos==1.53.0
greenlet==1.1.2
grpc-google-iam-v1==0.12.3
grpcio==1.41.0
grpcio-gcp==0.2.2
h5py==3.1.0
hdfs==2.6.0
httplib2==0.19.1
idna==3.3
importlib-metadata==4.8.1
importlib-resources==5.2.2
install==1.3.4
ipykernel==6.4.1
ipython==7.28.0
ipython-genutils==0.2.0
ipywidgets==7.6.5
itsdangerous==2.0.1
jedi==0.18.0
Jinja2==3.0.2
jmespath==0.10.0
joblib==0.14.1
JSON-log-formatter==0.4.0
jsonschema==3.2.0
jupyter-client==7.0.6
jupyter-core==4.8.1
jupyterlab-pygments==0.1.2
jupyterlab-widgets==1.0.2
keras==2.6.0
Keras-Preprocessing==1.1.2
keras-tuner==1.0.4
keyring==23.2.1
keyrings.google-artifactregistry-auth==0.0.3
kfp==1.8.7
kfp-pipeline-spec==0.1.13
kfp-server-api==1.7.0
kiwisolver==1.3.2
kt-legacy==1.0.4
kubernetes==12.0.1
libcst==0.3.21
Mako==1.1.5
Markdown==3.3.4
MarkupSafe==2.0.1
matplotlib==3.4.3
matplotlib-inline==0.1.3
mistune==0.8.4
ml-metadata==1.3.0
ml-pipelines-sdk==1.3.0
multidict==5.2.0
mypy-extensions==0.4.3
nbclient==0.5.4
nbconvert==6.2.0
nbformat==5.1.3
nest-asyncio==1.5.1
notebook==6.4.4
numpy==1.19.5
oauth2client==4.1.3
oauthlib==3.1.1
opt-einsum==3.3.0
optuna==2.10.0
orjson==3.6.4
packaging==20.9
pandas==1.3.3
pandas-gbq==0.15.0
pandocfilters==1.5.0
parso==0.8.2
patsy==0.5.2
pbr==5.6.0
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.3.2
pkginfo==1.7.1
pluggy==1.0.0
portpicker==1.4.0
prettytable==2.2.1
prometheus-client==0.11.0
promise==2.3
prompt-toolkit==3.0.20
proto-plus==1.19.5
protobuf==3.17.3
ptyprocess==0.7.0
pyarrow==5.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
pydantic==1.8.2
pydata-google-auth==1.2.0
pydot==1.4.2
Pygments==2.10.0
PyJWT==2.2.0
pymongo==3.12.0
pyparsing==2.4.7
pyperclip==1.8.2
pyrsistent==0.18.0
python-dateutil==2.8.2
python-snappy==0.6.0
pytz==2021.3
PyYAML==5.4.1
pyzmq==22.3.0
readme-renderer==30.0
requests==2.26.0
requests-oauthlib==1.3.0
requests-toolbelt==0.9.1
rfc3986==1.5.0
rsa==4.7.2
s3fs==2021.10.0
scikit-learn==0.24.2
scipy==1.7.1
seaborn==0.11.2
Send2Trash==1.8.0
six==1.15.0
SQLAlchemy==1.4.25
statsmodels==0.13.0
stevedore==3.4.0
strip-hints==0.1.10
tabulate==0.8.9
tenacity==8.0.1
tensorboard==2.7.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.0
tensorflow==2.6.0
tensorflow-addons==0.14.0
tensorflow-data-validation==1.3.0
tensorflow-estimator==2.6.0
tensorflow-hub==0.12.0
tensorflow-metadata==1.2.0
tensorflow-model-analysis==0.34.1
tensorflow-serving-api==2.6.0
tensorflow-transform==1.3.0
termcolor==1.1.0
terminado==0.12.1
testpath==0.5.0
tfds-nightly==4.4.0.dev202110150107
tfx==1.3.0
tfx-bsl==1.3.0
threadpoolctl==3.0.0
tornado==6.1
tqdm==4.62.3
traitlets==5.1.0
twine==3.4.2
typeguard==2.13.0
typer==0.4.0
typing-extensions==3.7.4.3
typing-inspect==0.7.1
uritemplate==3.0.1
urllib3==1.26.7
waitress==2.0.0
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.2.1
Werkzeug==2.0.2
widgetsnbextension==3.5.1
wrapt==1.12.1
xgboost==1.4.2
yarl==1.7.0
zipp==3.6.0
Describe the current behavior
Using a tensorflow_addons op inside the Transform preprocessing_fn causes the TFX pipeline run to fail with the error:
ValueError: Attempted to save ops from non-whitelisted namespaces to SavedModel: ['Addons>ParseTime']
Describe the expected behavior
Any tensorflow_addons operation used in the preprocessing_fn function can be correctly serialized when the transformation "model" is persisted.
Standalone code to reproduce the issue
Steps:
1 - Create a basic TFX pipeline that ingests a CSV file in which one of the columns is a string timestamp field.
2 - Create a transformation Python module containing a preprocessing_fn definition and use a tensorflow_addons op to read the timestamp column and convert it to a numeric value. E.g.:
my_transformation_module.py
import tensorflow_addons as tfa
...
def preprocessing_fn(inputs):
    outputs = {}
    # Parse the string timestamp into a numeric time value in milliseconds.
    outputs['my_timestamp_numeric_col'] = tfa.text.parse_time(
        inputs['my_csv_timestamp_column_name'], '"%Y-%m-%d %H:%M:%S"', "MILLISECOND")
    return outputs
3 - In the Python training module, use the transformed column:
def _build_estimator(config, hidden_units=None, warm_start_from=None):
    ...
    real_valued_columns += [
        tf.feature_column.numeric_column('my_timestamp_numeric_col', shape=())
    ]
4 - Run the pipeline.
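For reference, the per-row value that step 2 is trying to produce can be sketched in plain Python. This is only an illustration of the intended result, not what tfa.text.parse_time does internally; the helper name to_unix_millis is hypothetical, and treating the timestamp as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Same format string as in step 2; it expects literal double quotes around the value.
TIME_FORMAT = '"%Y-%m-%d %H:%M:%S"'
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_millis(raw: str, time_format: str = TIME_FORMAT) -> int:
    """Parse a timestamp string and return milliseconds since the Unix epoch.

    Assumption: the timestamp is interpreted as UTC.
    """
    parsed = datetime.strptime(raw, time_format).replace(tzinfo=timezone.utc)
    # Integer division of two timedeltas gives an exact integer count.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


print(to_unix_millis('"1970-01-01 00:00:01"'))  # prints 1000
```

A plain Python function like this cannot be used inside preprocessing_fn, which needs graph ops. That is why the addon op is used there in the first place.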
Other info / logs
The failure stack trace is:
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/apache_beam/transforms/core.py", line 1560, in <lambda>
wrapper = lambda x, *args, **kwargs: [fn(x, *args, **kwargs)]
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow_transform/beam/impl.py", line 646, in _create_v2_saved_model
impl_helper.trace_and_write_v2_saved_model(saved_model_dir, preprocessing_fn,
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow_transform/impl_helper.py", line 799, in trace_and_write_v2_saved_model
concrete_transform_fn = _trace_and_write_transform_fn(
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow_transform/impl_helper.py", line 738, in _trace_and_write_transform_fn
return saved_transform_io_v2.write_v2_saved_model(
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow_transform/saved/saved_transform_io_v2.py", line 528, in write_v2_saved_model
tf.saved_model.save(module, saved_model_dir)
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1193, in save
save_and_return_nodes(obj, export_dir, signatures, options)
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1228, in save_and_return_nodes
_build_meta_graph(obj, signatures, options, meta_graph_def))
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1399, in _build_meta_graph
return _build_meta_graph_impl(obj, signatures, options, meta_graph_def)
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 1351, in _build_meta_graph_impl
asset_info, exported_graph = _fill_meta_graph_def(
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 878, in _fill_meta_graph_def
_verify_ops(graph_def, namespace_whitelist)
File "/opt/anaconda3/envs/msys-38-env/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py", line 914, in _verify_ops
raise ValueError(
ValueError: Attempted to save ops from non-whitelisted namespaces to SavedModel: ['Addons>ParseTime'].
Please verify that these ops should be saved, since they must be available when loading the SavedModel. If loading from Python, you must import the library defining these ops. From C++, link the custom ops to the serving binary. Once you've confirmed this, please add the following namespaces to the `namespace_whitelist` argument in tf.saved_model.SaveOptions: {'Addons'}. [while running 'Analyze/CreateSavedModel[tf_v2_only]/CreateSavedModel']
As can be seen in saved_transform_io_v2.py, the call tf.saved_model.save(module, saved_model_dir) does not allow the user to pass a custom tf.saved_model.SaveOptions object, so the user cannot whitelist the Addons namespace and therefore cannot use addon ops in a transformation.
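To make the error message easier to read, here is a simplified sketch of the kind of check it describes: op names that carry a namespace prefix before ">" (such as Addons>ParseTime) are rejected unless that namespace is whitelisted. This is not TensorFlow's actual _verify_ops code; the function name and the sample op list are made up for illustration.

```python
def find_non_whitelisted_namespaces(op_names, namespace_whitelist=()):
    """Return (offending_ops, offending_namespaces) for namespaced op names.

    An op is treated as namespaced when its name contains '>',
    for example 'Addons>ParseTime'.
    """
    allowed = set(namespace_whitelist)
    bad_ops = []
    bad_namespaces = set()
    for name in op_names:
        if ">" not in name:
            continue  # op names without a prefix are not subject to the check
        namespace = name.split(">", 1)[0]
        if namespace not in allowed:
            bad_ops.append(name)
            bad_namespaces.add(namespace)
    return bad_ops, bad_namespaces


# Hypothetical list of ops used by a traced transform graph.
ops = ["Identity", "StringToNumber", "Addons>ParseTime"]

print(find_non_whitelisted_namespaces(ops))
print(find_non_whitelisted_namespaces(ops, namespace_whitelist=["Addons"]))
```

With an empty whitelist the addon op is flagged, which corresponds to the ValueError in the trace. With "Addons" whitelisted nothing is flagged. In TensorFlow the whitelist comes from the namespace_whitelist argument of tf.saved_model.SaveOptions, as the error message says, and that is the argument the call in saved_transform_io_v2.py gives no way to set.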
I have the same problem. Any update?
This is an old thread, but I was able to save the model by calling tfa.register_all() before saving, to register all the addons.
@calin-coan As mentioned in the comment above, can you please register all the addons before saving? Thanks!
Closing this issue as it has been inactive for a while. Please add additional comments to reopen this issue. Thanks!!