doctr Inconsistent references when loading the checkpoint

Bug description

Hi,

I want to use the recognition_predictor, but I get some errors when loading it. I tested my code in a Colaboratory notebook, so the environment is clean. Here is my code:

!pip install "python-doctr[tf]"

and then

from doctr.models import recognition_predictor
model = recognition_predictor('sar_resnet31', pretrained=True)

but I get:

WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<keras.layers.normalization.batch_normalization.BatchNormalization object at 0x7f3fe0e70dd0> and <keras.layers.pooling.MaxPooling2D object at 0x7f3fe0e77510>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e96650> and <doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e236d0>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e236d0> and <doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e32710>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e32710> and <keras.layers.convolutional.Conv2D object at 0x7f3fe0e41790>).

I also tried the MASTER model, but I get the same kind of warnings.

Code snippet to reproduce the bug

from doctr.models import recognition_predictor
model = recognition_predictor('sar_resnet31', pretrained=True)

Error traceback

WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<keras.layers.normalization.batch_normalization.BatchNormalization object at 0x7f3fe0e70dd0> and <keras.layers.pooling.MaxPooling2D object at 0x7f3fe0e77510>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e96650> and <doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e236d0>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e236d0> and <doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e32710>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<doctr.models.classification.resnet.tensorflow.ResnetBlock object at 0x7f3fe0e32710> and <keras.layers.convolutional.Conv2D object at 0x7f3fe0e41790>).

Environment

DocTR version: 0.5.0
TensorFlow version: 2.6.3
PyTorch version: 1.10.0+cu111 (torchvision 0.11.1+cu111)
OpenCV version: 4.1.2
OS: Ubuntu 18.04.5 LTS
Python version: 3.7.12
Is CUDA available (TensorFlow): No
Is CUDA available (PyTorch): No
CUDA runtime version: 11.1.105
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5

Feb 16 '22 09:02 LCMA7

Hi @LCMA7 :wave:

Those are only warnings, and we do experience them on a few models. It's hard to get to the bottom of this, as PyTorch doesn't operate the same way. In a nutshell, TensorFlow does lazy initialization by default (meaning the kernels are not initialized when you instantiate the model, but only after the first forward pass).

So when we load the checkpoint into the model, if some kernels are not yet initialized, it throws those warnings. But we have to clean up the SAR & MASTER architecture implementations soon anyway, so I'll check whether there are any more troublesome causes behind this :+1:
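To make the lazy-initialization point concrete, here is a toy sketch in plain Python. `LazyDense` and `load_checkpoint` are hypothetical names made up for illustration; they are not TensorFlow or docTR APIs, and the sketch does not reproduce TensorFlow's actual checkpoint-matching logic. It only shows the idea that a layer has no kernel until its first forward pass, so a load attempted before that has nothing to match against.

```python
# Toy illustration only: `LazyDense` and `load_checkpoint` are hypothetical,
# not part of TensorFlow or docTR.
class LazyDense:
    def __init__(self, units):
        self.units = units
        self.kernel = None   # no weights exist at instantiation time
        self.built = False

    def build(self, in_features):
        # The kernel shape depends on the input, so it is created lazily.
        self.kernel = [[0.0] * self.units for _ in range(in_features)]
        self.built = True

    def __call__(self, x):
        if not self.built:
            self.build(len(x))  # the first forward pass triggers the build
        return [
            sum(xi * self.kernel[i][j] for i, xi in enumerate(x))
            for j in range(self.units)
        ]


def load_checkpoint(layer, saved_kernel):
    """Copy saved weights into the layer, or report that it is not built yet."""
    if not layer.built:
        return "warning: layer not built, nothing to match the checkpoint against"
    layer.kernel = [row[:] for row in saved_kernel]
    return "ok"


layer = LazyDense(units=2)
saved = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

status_before = load_checkpoint(layer, saved)  # the layer has no kernel yet
layer([0.0, 0.0, 0.0])                         # dummy forward pass builds it
status_after = load_checkpoint(layer, saved)   # now the weights can be copied
output = layer([1.0, 1.0, 1.0])
```

In this toy, running a dummy forward pass before loading is what makes the second load succeed; whether the same ordering removes the warnings in the real models is not something this sketch can confirm.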

Feb 21 '22 10:02 fg-mindee

#925 will fix this for SAR :)

May 24 '22 19:05 felixdittrich92

@felixdittrich92 are you positive? Because that's a frequent warning in TF when the layers are not all built exactly as it expects before loading a checkpoint 😅

May 25 '22 19:05 frgfm

> @felixdittrich92 are you positive? Because that's a frequent warning in TF when the layers are not all built exactly as it expects before loading a checkpoint 😅

@frgfm it will fix the inconsistent results, not the warnings :sweat_smile: I have tested my PR several times and was not able to see any inconsistent results.

May 25 '22 20:05 felixdittrich92

@frgfm wdyt, can we close this? There are only warnings left (the models work fine), which we can't avoid, I think.
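For anyone who just wants a quieter notebook, a minimal sketch of hiding the messages while the predictor is built. It assumes the warnings go through the standard Python logger named `tensorflow`, which the `WARNING:tensorflow:` prefix suggests; `quiet_logger` is a hypothetical helper, not a docTR or TensorFlow function.

```python
import logging
from contextlib import contextmanager


@contextmanager
def quiet_logger(name="tensorflow", level=logging.ERROR):
    """Temporarily raise a logger's threshold, then restore the old one."""
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        logger.setLevel(previous)


with quiet_logger() as tf_logger:
    # Build the predictor inside this block, e.g.
    # model = recognition_predictor('sar_resnet31', pretrained=True)
    warnings_enabled = tf_logger.isEnabledFor(logging.WARNING)
```

Note that this hides every TensorFlow warning raised inside the block, not only the checkpoint ones, so it is best kept around the loading call only.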

Sep 15 '22 18:09 felixdittrich92

I agree :+1:

Sep 16 '22 10:09 frgfm
