
TF-Java SavedModels can't be loaded into TF Python

Open Craigacp opened this issue 4 years ago • 5 comments

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04 x86_64): macOS 11
  • TensorFlow installed from (source or binary): Binary
  • TensorFlow version (use command below): 0.3.1 - 0.4.0-SNAPSHOT, loading into TF 2.6.0.
  • Java version (i.e., the output of java -version): Java 11
  • Java command line flags (e.g., GC parameters): n/a
  • Python version (if transferring a model trained in Python): Python 3.8

Describe the current behavior: Saving a model from Java generates a SavedModel which saved_model_cli considers valid, but which cannot be loaded into Python.

Loading it into Python gives the following stack trace:

>>> model = tf.saved_model.load("./tf-cnn-mnist-model")
2021-08-24 11:45:34.330161: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./tf-2-py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 864, in load
    result = load_internal(export_dir, tags, options)["root"]
  File "./tf-2-py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 922, in load_internal
    root = load_v1_in_v2.load(export_dir, tags)
  File "./tf-2-py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 286, in load
    return loader.load(tags=tags)
  File "./tf-2-py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 233, in load
    restore_from_saver = self._extract_saver_restore(wrapped, saver)
  File "./tf-2-py38/lib/python3.8/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 109, in _extract_saver_restore
    return wrapped.prune(
  File "./tf-2-py38/lib/python3.8/site-packages/tensorflow/python/eager/wrap_function.py", line 279, in prune
    raise ValueError("Feeds must be tensors.")
ValueError: Feeds must be tensors.

saved_model_cli reports:

$ saved_model_cli show --all --dir ./tf-cnn-mnist-model/

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['MNIST_INPUT'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: MNIST_INPUT:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['tribuo-internal_1'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: tribuo-internal_1:0
  Method name is: 

This particular model was saved out using Tribuo's save logic, which performs the following operations:

        // Build a signature whose inputs are the graph's feature placeholders.
        Signature.Builder sigBuilder = Signature.builder();
        Set<String> inputs = featureConverter.inputNamesSet();
        for (String s : inputs) {
            Operation inputOp = modelGraph.operation(s);
            sigBuilder.input(s, inputOp.output(0));
        }
        // Add the model's output operation and build the signature.
        Operation outputOp = modelGraph.operation(outputName);
        Signature modelSig = sigBuilder.output(outputName, outputOp.output(0)).build();
        // Wrap the signature and session in a ConcreteFunction and export it as a SavedModel.
        ConcreteFunction concFunc = ConcreteFunction.create(modelSig, session);
        SavedModelBundle.exporter(path).withFunction(concFunc).export();

The initial report is here - https://discuss.tensorflow.org/t/valueerror-feeds-must-be-tensors/3915.

Describe the expected behavior: The saved model should load into Python without error.

Craigacp avatar Aug 24 '21 16:08 Craigacp

It could be interesting to see if the names of the save ops in the graph (i.e. those under the save/ subscope) match the ones provided in the SaverDef proto that comes with the SavedModel bundle. @Craigacp, can you try to output this?
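
For reference, one way to dump that information from the Python side is to parse the saved_model.pb directly; here is a rough sketch (using the model directory from the report above and TensorFlow's bundled protos), not something I've run against this exact model:

from tensorflow.core.protobuf import saved_model_pb2

# Parse the SavedModel proto to get at the SaverDef and the GraphDef.
sm = saved_model_pb2.SavedModel()
with open("./tf-cnn-mnist-model/saved_model.pb", "rb") as f:
    sm.ParseFromString(f.read())

meta_graph = sm.meta_graphs[0]
saver_def = meta_graph.saver_def
print("filename_tensor_name:", saver_def.filename_tensor_name)
print("restore_op_name:     ", saver_def.restore_op_name)
print("save_tensor_name:    ", saver_def.save_tensor_name)

# List the save/* nodes that actually exist in the graph, for comparison.
for node in meta_graph.graph_def.node:
    if node.name.startswith("save/"):
        print(node.name, node.op)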

karllessard avatar Aug 24 '21 17:08 karllessard

Also, what I find puzzling is that the Python code appears to be trying to load the model in a way that would allow it to be executed eagerly.

Can you try disabling default eager execution prior to loading the model (probably by calling tf.compat.v1.disable_eager_execution())?
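
Something along these lines (a rough sketch, using the same model directory as above):

import tensorflow as tf

# Switch the process back to graph mode; this has to happen before any other
# TensorFlow calls in the interpreter.
tf.compat.v1.disable_eager_execution()

model = tf.saved_model.load("./tf-cnn-mnist-model")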

karllessard avatar Aug 24 '21 17:08 karllessard

One more suggestion (sorry, I can't try these out as I don't have access to my laptop right now): it looks like Python expects the filename tensor name provided in the SaverDef to use tensor notation, i.e. save/filename:0 instead of save/filename. Can you update the TF Java code to use this notation and see if that works?
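
To illustrate the distinction (a standalone sketch with a stand-in placeholder, not the actual ops TF Java generates):

import tensorflow as tf

g = tf.Graph()
with g.as_default():
    # Stand-in for the filename placeholder a Saver would create.
    tf.compat.v1.placeholder(tf.string, name="save/filename")

# "save/filename" resolves to an Operation, while "save/filename:0" resolves
# to a Tensor; wrap_function.prune() only accepts tensors as feeds, which is
# where the "Feeds must be tensors." error above comes from.
print(type(g.as_graph_element("save/filename")))    # Operation
print(type(g.as_graph_element("save/filename:0")))  # Tensor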

karllessard avatar Aug 24 '21 18:08 karllessard

I was able to do a quick test, and it seems that my last suggestion works; I'll work on a patch.

karllessard avatar Aug 25 '21 02:08 karllessard

This should be fixed in 0.3.3, which is in the process of being released. Once that's done we'll get the fix merged into master as well.

Craigacp avatar Aug 30 '21 21:08 Craigacp