
Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT

sky712345678 opened this issue 3 years ago • 21 comments

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.9
  • Python version: 3.7
  • Bazel version (if compiling from source): N/A
  • GPU model and memory: N/A (CPU)
  • Exact command to reproduce:
  1. Open the notebook with Google Colab
  2. Run all cells
  3. View the runtime logs
Note: we have to upgrade TensorFlow and Keras to 2.9 manually in the notebook, because the default version currently available on Colab is not the latest.

Describe the problem. (Continued from tensorflow_issue_57052.) I got a "Type inference failed" error when running tf.keras.Model.fit() with TensorFlow 2.9 and Keras 2.9. I didn't see this error in version 2.8 with identical code. Although the program didn't crash, I'm afraid the trained model may contain errors.

Describe the current behavior. Running tf.keras.Model.fit() produces the "Type inference failed" error.

Describe the expected behavior. The error shouldn't show up.

Contributing.

  • Do you want to contribute a PR? (yes/no): no
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution (if contributing):

Standalone code to reproduce the issue. Link to notebook: https://drive.google.com/file/d/1k78lpGVthB7nthEkYgUs3JNJTuR79r5E/view?usp=sharing To reproduce:

  1. Open the notebook with Google Colab
  2. Run all cells
  3. View the runtime logs

Source code / logs.

2022-08-20 17:18:05.533157: W tensorflow/core/common_runtime/forward_type_inference.cc:231] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_BOOL
    }
  }
}
 is neither a subtype nor a supertype of the combined inputs preceding it:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_LEGACY_VARIANT
    }
  }
}

        while inferring type of node 'dice_loss/cond/output/_11'

sky712345678 avatar Sep 02 '22 07:09 sky712345678

@gadagashwini I was able to replicate the issue on colab, please find the gist here. Thank you!

sushreebarsa avatar Sep 06 '22 07:09 sushreebarsa

Hi @sky712345678, W tensorflow/core/common_runtime/forward_type_inference.cc:231] Type inference failed. is just a warning; you can safely ignore it. The given code executed without any error message. Thank you!

gadagashwini avatar Sep 14 '22 08:09 gadagashwini

@gadagashwini what's the point of a warning if the response is simply "you can safely ignore it"? It's clearly there for a reason.

tgsmith61591 avatar Sep 20 '22 17:09 tgsmith61591

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] avatar Sep 27 '22 18:09 google-ml-butler[bot]

@gadagashwini can you talk a little bit more about the reason why we can safely ignore it? Thank you!

sky712345678 avatar Sep 28 '22 02:09 sky712345678

@sky712345678 This looks like an issue in TensorFlow itself. Can you please create this issue in tensorflow/tensorflow? Thank you!!

gowthamkpr avatar Sep 29 '22 17:09 gowthamkpr

@gowthamkpr Well, the problem was first reported in tensorflow as https://github.com/tensorflow/tensorflow/issues/57052, but the guys there told the reporter to instead post an issue here.

If you know any more details (why it is a TensorFlow issue), could you please provide more details that we can give to the TF guys?

foxik avatar Oct 02 '22 08:10 foxik

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler[bot] avatar Oct 09 '22 09:10 google-ml-butler[bot]

@gowthamkpr The issue was originally a TF issue, but we were redirected to post it here. If you know any more details (why it is a TF issue and not Keras), could you please provide more details that we can give to the TF guys? Thanks!

foxik avatar Oct 09 '22 16:10 foxik


The issue is at the level of the dice_loss. Can you try producing a reproduction script that only involves the loss function? Maybe just try to backprop through the loss function and see what happens.

I think this should be reproducible without involving any Keras logic, at which point the TF folks will definitely look at it. But anyway, as said before, this is just a warning, not something critical. You can ignore it.

fchollet avatar Oct 13 '22 17:10 fchollet
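To make the suggestion above concrete, here is a minimal sketch of a standalone reproduction attempt that backprops through a loss function alone, with no Keras fit() loop involved. The `dice_loss` below is a hypothetical stand-in for the notebook's loss, not its actual implementation:

```python
import tensorflow as tf

def dice_loss(y_true, y_pred, smooth=1e-6):
    # Hypothetical Dice loss; the original notebook's implementation may differ.
    intersection = tf.reduce_sum(y_true * y_pred)
    union = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (union + smooth)

@tf.function
def loss_and_grad(y_true, y_pred):
    # Backprop through the loss alone, inside a traced graph.
    with tf.GradientTape() as tape:
        tape.watch(y_pred)  # y_pred is a constant here, so watch it explicitly
        loss = dice_loss(y_true, y_pred)
    return loss, tape.gradient(loss, y_pred)

y_true = tf.constant([[1.0, 0.0], [0.0, 1.0]])
y_pred = tf.constant([[0.8, 0.2], [0.1, 0.9]])
loss, grad = loss_and_grad(y_true, y_pred)
print(float(loss), grad.shape)
```

If the warning comes from the loss itself, it should show up when tracing a function like this; if not, the problematic `cond` node is likely introduced elsewhere in the training graph.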

Ok, I got it. Thank you!

I wasn't sure how to reproduce it only involving the loss function, this is my try: https://colab.research.google.com/drive/1qxamrOaOqfVANzMnN-u--Sue4iPtJCtf?usp=sharing image Running this Colab notebook, I didn't see the error message in runtime logs.

sky712345678 avatar Oct 19 '22 06:10 sky712345678


Hi, can you share how you resolved this issue? I've run into a similar problem. Thank you so much!

TuanHAnhVN avatar Dec 19 '22 15:12 TuanHAnhVN

I'm getting the same warning with TF 2.11 when I set mask_zero=True in the embedding layer.

isohrab avatar Dec 29 '22 12:12 isohrab

+1; I'm also getting the same warning with TF 2.11 when setting mask_zero=True in the embedding layer while training on GPU. After the warning, the model keeps training and is then saved, but the saved model can't be loaded using keras.models.load_model. However, when training on CPU (even with mask_zero=True), everything works fine and the warning doesn't show up: the model is trained, saved, and can be loaded and used again without any problem.

alibahmanyar avatar Jan 05 '23 08:01 alibahmanyar
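For reference, a minimal sketch of the setup the commenters above describe, an Embedding layer with mask_zero=True feeding a recurrent layer. The layer sizes and data are arbitrary placeholders, not taken from any reporter's actual model:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # mask_zero=True treats index 0 as padding and propagates a mask downstream
    tf.keras.layers.Embedding(input_dim=100, output_dim=8, mask_zero=True),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Zero-padded integer sequences; 0 is the mask value.
x = tf.constant([[3, 7, 0, 0], [5, 2, 9, 0]])
y = tf.constant([[1.0], [0.0]])
model.fit(x, y, epochs=1, verbose=0)  # the warning, when it occurs, appears here
```

The mask turns the LSTM's internal loop into conditional graph nodes, which is consistent with the `cond_*/output` node names in the reported warnings.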

I'm getting something very similar but with pure TF 2.11 on Mac M1. So I really think this is a pure TF issue, and we should reopen the TF issue (https://github.com/tensorflow/tensorflow/issues/57052).

albertz avatar Feb 17 '23 09:02 albertz

I have the same issue unfortunately. Currently running with mask_zero set to True and using CPU without issue.

NinaCilliers avatar Aug 21 '23 00:08 NinaCilliers

Hi @sky712345678, W tensorflow/core/common_runtime/forward_type_inference.cc:231] Type inference failed. is just a warning, you can safely ignore it. Given code executed without any error message. Thank you!

Nope, because the execution time increases eight-fold!

cromicron avatar Oct 06 '23 08:10 cromicron

Hi, on TF 2.13.0 I get this warning as well when training a simple encoder-decoder EN-ES translation model with an LSTM over embeddings with mask_zero=True:

2023-10-07 11:39:56.995271: W tensorflow/core/common_runtime/type_inference.cc:339] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_INT32
    }
  }
}
 is neither a subtype nor a supertype of the combined inputs preceding it:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_FLOAT
    }
  }
}

for Tuple type infernce function 0
while inferring type of node 'cond_40/output/_23'

The model trains, but when I then try jit_compile=True, fit() breaks with:

2023-10-07 11:46:15.327751: W tensorflow/core/framework/op_kernel.cc:1828] OP_REQUIRES failed at xla_ops.cc:444 : INVALID_ARGUMENT: Trying to access resource 7590 located in device /job:localhost/replica:0/task:0/device:CPU:0 from device /job:localhost/replica:0/task:0/device:GPU:0

Someone on Stack Exchange suggested that this JIT failure might be linked to TF code creating something as INT32 instead of FLOAT32, resulting in some variables being placed on the CPU, which seems to be connected to the above-mentioned error.

iconrnd avatar Oct 07 '23 09:10 iconrnd

Still getting this error; has there been any update?

Epoch 1/20
2024-06-18 14:04:10.665333: W tensorflow/core/common_runtime/type_inference.cc:339] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_INT32
    }
  }
}
 is neither a subtype nor a supertype of the combined inputs preceding it:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_FLOAT
    }
  }
}

	while inferring type of node 'cond_42/output/_24'
2024-06-18 14:04:10.835800: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:630] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.

Frank-Schiro avatar Jun 18 '24 18:06 Frank-Schiro

I have solved a similar error in my code, and here's how I did it.

I think the problem arises when using @tf.function, or any function containing a condition, while running a TF graph; in my case it happened during model.fit().

The warning indicates that an invalid graph escaped type checking. When you use if/else statements inside @tf.function code, AutoGraph converts them into tf.cond(). During model.fit(), TensorFlow emits this warning when elif is used; replacing the elif branches with plain if statements made the warning go away for me.

Implementation of the function before the fix. It was used in a loss function passed to model.compile() and later model.fit():

import tensorflow as tf
class RescaleImage():
    def __init__(self) -> None:
        super().__init__()
    
    @tf.function
    def normalize(self, x:tf.Tensor, min_val: float=0.0, max_val: float=1.0)->tf.Tensor:
        min_val = tf.cast(min_val,tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        if tf.reduce_max(x)>1.0 and tf.reduce_min(x)>=0.0:
            if min_val==0.0 and max_val==1.0:
                x = x/255.0
            elif min_val==-1.0 and max_val==1.0:
                x = (x - 127.5)/127.5
        elif tf.reduce_max(x)<=1.0 and tf.reduce_min(x)>=-1.0 and tf.reduce_min(x)<0.0:
            if min_val==0.0 and max_val==1.0:
                x = (x+1.0)/2.0
            elif min_val==0.0 and max_val==255.0:
                x = (x+1.0)*255.0/2.0

        elif tf.reduce_max(x)<=1.0 and tf.reduce_min(x)>=0.0:
            if min_val==-1.0 and max_val==1.0:
                x = (x-0.5)/0.5
            elif min_val==0.0 and max_val==255.0:
                x = x*255.0
        return x
    
    @tf.function
    def normalize_individual(self, x:tf.Tensor, min_val: float=0.0, max_val: float=1.0)->tf.Tensor:
        min_val = tf.cast(min_val,tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        if tf.reduce_max(x)>1.0 and tf.reduce_min(x)>=0.0:
            factor = (max_val-min_val)/(tf.math.reduce_max(x)-tf.math.reduce_min(x))
            x = factor*(x - tf.math.reduce_min(x))+min_val
            
        elif tf.reduce_max(x)<=1.0 and tf.reduce_min(x)>=-1.0 and tf.reduce_min(x)<0.0:
            if min_val==0.0 and max_val==1.0:
                x = (x+1.0)/2.0
            elif min_val==0.0 and max_val==255.0:
                x = (x+1.0)*255.0/2.0

        elif tf.reduce_max(x)<=1.0 and tf.reduce_min(x)>=0.0:
            if min_val==-1.0 and max_val==1.0:
                x = (x-0.5)/0.5
            elif min_val==0.0 and max_val==255.0:
                x = x*255.0
        return x

Code after solving the error (using plain if statements instead of elif):

import tensorflow as tf

class RescaleImage():
    def __init__(self) -> None:
        super().__init__()
    
    @tf.function
    def normalize(self, x:tf.Tensor, min_val: float=0.0, max_val: float=1.0)->tf.Tensor:
        min_val = tf.cast(min_val,tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        if tf.reduce_max(x)>1.0 and tf.reduce_min(x)>=0.0:
            if min_val==0.0 and max_val==1.0:
                x = x/255.0
            if min_val==-1.0 and max_val==1.0:
                x = (x - 127.5)/127.5

        if tf.reduce_max(x)<=1.0 and tf.reduce_min(x)>=-1.0 and tf.reduce_min(x)<0.0:
            if min_val==0.0 and max_val==1.0:
                x = (x+1.0)/2.0
            if min_val==0.0 and max_val==255.0:
                x = (x+1.0)*255.0/2.0

        if tf.reduce_max(x)<=1.0 and tf.reduce_min(x)>=0.0:
            if min_val==-1.0 and max_val==1.0:
                x = (x-0.5)/0.5
            if min_val==0.0 and max_val==255.0:
                x = x*255.0

        return x
    
    @tf.function
    def normalize_individual(self, x:tf.Tensor, min_val: float=0.0, max_val: float=1.0)->tf.Tensor:
        min_val = tf.cast(min_val,tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        if tf.reduce_max(x)>1.0 and tf.reduce_min(x)>=0.0:
            factor = (max_val-min_val)/(tf.math.reduce_max(x)-tf.math.reduce_min(x))
            x = factor*(x - tf.math.reduce_min(x))+min_val
            
        if tf.reduce_max(x)<=1.0 and tf.reduce_min(x)>=-1.0 and tf.reduce_min(x)<0.0:
            if min_val==0.0 and max_val==1.0:
                x = (x+1.0)/2.0
            if min_val==0.0 and max_val==255.0:
                x = (x+1.0)*255.0/2.0

        if tf.reduce_max(x)<=1.0 and tf.reduce_min(x)>=0.0:
            if min_val==-1.0 and max_val==1.0:
                x = (x-0.5)/0.5
            if min_val==0.0 and max_val==255.0:
                x = x*255.0

        return x

neevmanvar avatar Jul 31 '24 15:07 neevmanvar
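As a usage illustration, here is a condensed, self-contained sketch of calling a normalize method like the one above. Only the [0, 255] to [0, 1] branch is kept, so this is a trimmed stand-in, not the full class as posted:

```python
import tensorflow as tf

class RescaleImage:
    @tf.function
    def normalize(self, x, min_val=0.0, max_val=1.0):
        # Condensed version of the posted method: only the [0, 255] -> [0, 1] branch.
        min_val = tf.cast(min_val, tf.float32)
        max_val = tf.cast(max_val, tf.float32)
        # AutoGraph converts these tensor-valued if statements into tf.cond().
        if tf.reduce_max(x) > 1.0 and tf.reduce_min(x) >= 0.0:
            if min_val == 0.0 and max_val == 1.0:
                x = x / 255.0
        return x

rescaler = RescaleImage()
x = tf.constant([[0.0, 127.5, 255.0]])
print(rescaler.normalize(x).numpy())  # values scaled into [0, 1]
```

Because every branch assigns a float32 tensor of the same shape to x, the generated tf.cond() branches have matching types, which is exactly the property the type-inference warning complains about when it is violated.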