Error when using tensorflow-text on tensorflow-core
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04 x86_64): macOS Sonoma 14.2.1
- TensorFlow installed from (source or binary):
- TensorFlow python version (use command below): 2.11.0
- TensorFlow java version (use command below): 0.5.0
- Java version (i.e., the output of java -version): 11.0.21.0.101
- Java command line flags (e.g., GC parameters):
- Python version (if transferring a model trained in Python): 3.9.7
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version: N/A (CPU)
- GPU model and memory: N/A
Describe the current behavior
When we train a Keras model with a layer that uses tensorflow-text's BertTokenizer or FastBertTokenizer and load it in Java using SavedModelBundle.load(dirPath, "serve"), we get the following error:
org.tensorflow.exceptions.TensorFlowException: Converting GraphDef to Graph has failed with an error: 'Op type not registered 'CaseFoldUTF8' in binary running on <my machine>. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.' The binary trying to import the GraphDef was built when GraphDef version was 1205. The GraphDef was produced by a binary built when GraphDef version was 1286. The difference between these versions is larger than TensorFlow's forward compatibility guarantee, and might be the root cause for failing to import the GraphDef.
FWIW, the issue is not the GraphDef version mismatch. I tried multiple combos of tensorflow and tensorflow-core, namely (2.9.3, 0.4.2) and (2.10.1, 0.5.0), and the error remains the same.
Describe the expected behavior
A way to register CaseFoldUTF8 (and possibly other tf-text ops) when loading a SavedModel in Java, or support for tensorflow-text out of the box in TensorFlow Java.
You can load in tensorflow-text's native library using https://github.com/tensorflow/java/blob/master/tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/TensorFlow.java#L99. I don't recall if it works in the current version on main, but we have a test for it in the bazelcism branch based on TF 2.15.
@Craigacp I tried this based on the test you pointed to
val libname = System.mapLibraryName("_normalize_ops").substring(3)
val customOpLibrary = Paths.get(libname).toFile
val opList = TensorFlow.loadLibrary(customOpLibrary.getAbsolutePath)
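As an aside on the first two lines of that snippet, here is a minimal stdlib-only sketch of what they compute (class and method names are my own, nothing here is TF-specific): System.mapLibraryName applies the platform naming convention, and substring(3) strips the leading "lib" because the op libraries inside the tensorflow-text wheel are named without it (e.g. _normalize_ops.so).

```java
public class LibNameDemo {
    public static String wheelLibName(String opName) {
        // System.mapLibraryName: "_normalize_ops" -> "lib_normalize_ops.so"
        // on Linux, "lib_normalize_ops.dylib" on macOS; substring(3) then
        // strips the "lib" prefix to match the filenames shipped in the
        // tf-text wheel ("_normalize_ops.so").
        // Caveat: Windows adds no "lib" prefix, so substring(3) would
        // mangle the name there.
        return System.mapLibraryName(opName).substring(3);
    }

    public static void main(String[] args) {
        System.out.println(wheelLibName("_normalize_ops"));
    }
}
```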
But I get the following error
java.lang.UnsatisfiedLinkError:
dlopen(<masked out for privacy>/_normalize_ops.dylib, 0x0006): Symbol not found: __ZN10tensorflow15TensorShapeBaseINS_11TensorShapeEEC2EN4absl12lts_202111024SpanIKxEE
Referenced from: <E6640FFE-A3EA-388C-80D2-BB438DF363C2> <masked out for privacy>/_normalize_ops.dylib
Expected in: <F65DBC71-D914-3EAC-8A96-F72A30DB559C> /Users/ashish.srinivasa/.javacpp/cache/tensorflow-core-api-0.5.0-macosx-x86_64.jar/org/tensorflow/internal/c_api/macosx-x86_64/libtensorflow_framework.2.dylib
The library path points to an unzipped .whl built for macOS that I downloaded from PyPI.
Yeah, that's the right approach, but you're hitting the issue where we don't compile in C++ 11 mode and base TensorFlow does now. If you try and compile the bazelcism branch that should work, as in that branch we pull down the binaries from TF Python and compile against those, but in the 0.5.0 release it's not going to work.
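One way to see the mismatch from the dlopen error above: Abseil wraps its symbols in a versioned inline namespace (here absl::lts_20211102, the 2021-11-02 Abseil LTS release), so the mangled name itself records which Abseil release the library was compiled against. A small sketch (regex and class name are my own) that pulls the tag out:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SymbolHint {
    // In an Itanium-mangled name, absl::lts_YYYYMMDD shows up as
    // "4absl12lts_YYYYMMDD". Extracting the date tag tells you which
    // Abseil LTS release the library expects at link time.
    public static String abslLtsTag(String mangledSymbol) {
        Matcher m = Pattern.compile("lts_(\\d{8})").matcher(mangledSymbol);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String sym = "__ZN10tensorflow15TensorShapeBaseINS_11TensorShapeEE"
                   + "C2EN4absl12lts_202111024SpanIKxEE";
        System.out.println("Abseil LTS tag: " + abslLtsTag(sym));
    }
}
```

If that tag differs between the tf-text library and the libtensorflow_framework the JVM already loaded, the symbols cannot resolve, which is the ABI incompatibility described above.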
@Craigacp Any chance of releasing a new TensorFlow Java package (org.tensorflow:tensorflow-core-api) that depends on TensorFlow 2.14.0? We encountered a similar issue. We have a model that depends on FastBertNormalize and got the error Converting GraphDef to Graph has failed with an error: 'Op type not registered 'FastBertNormalize' in binary running, so we want to load the tensorflow_text .so files. We tested TensorFlow.loadLibrary(tfTextFile) under the following version combinations:
- org.tensorflow:tensorflow-core-api:0.4.0 and tensorflow_text 2.7.0
- org.tensorflow:tensorflow-core-api:0.4.2 and tensorflow_text 2.7.3
- org.tensorflow:tensorflow-core-api:0.5.0 and tensorflow_text 2.9.0
- org.tensorflow:tensorflow-core-api:0.5.0 and tensorflow_text 2.10.0
But none of them worked. The most promising was the first one (org.tensorflow:tensorflow-core-api:0.4.0): some of the .so files were loaded successfully, but others were not. The error is
undefined symbol: _ZNK10tensorflow8OpKernel11TraceStringB5cxx11ERKNS_15OpKernelContextEb
So apparently in this case _fast_bert_normalizer.so is not successfully imported, and the model cannot be loaded.
We got the .so files from the python/ops folder after unzipping the tensorflow-text .whl files.
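A hedged sketch (class, method, and file names are mine, not from this thread) of automating those two steps: a .whl is just a zip archive, so you can list the op libraries under python/ops/ and then probe which ones dlopen cleanly instead of finding out one UnsatisfiedLinkError at a time.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class TfTextOpProbe {

    /** List archive entries that look like native op libraries under python/ops/. */
    public static List<String> listOpLibraries(File wheel) throws IOException {
        List<String> found = new ArrayList<>();
        try (ZipFile zip = new ZipFile(wheel)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.contains("python/ops/") && name.endsWith(".so")) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    /** Try to load one extracted library; returns false instead of throwing. */
    public static boolean tryLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // Typical failure mode from this thread: undefined C++ symbols
            // when the library was built against a different TF/absl ABI.
            System.err.println(absolutePath + " failed: " + e.getMessage());
            return false;
        }
    }
}
```

This only reports which libraries resolve against the already-loaded TensorFlow runtime; it does not fix an ABI mismatch.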
We're releasing TF-Java 1.0.0-rc1, based on TF 2.16.1, this week. Our CI now includes a test for loading tensorflow-text ops, so it should work.
1.0.0-rc1 is out on Maven Central - https://central.sonatype.com/artifact/org.tensorflow/tensorflow-core-native
Thanks for releasing that. Confirmed that it worked on our dev environment. This unblocked our project and saved us tons of time!
@lastmansleeping, if you get a chance to test it as well and close this issue, that would be awesome, thanks!