image-segmentation-keras Tensorflow 2.5.0 compatibility with the package

I am trying to use your powerful library with an RTX 3070 Ti. After creating a Python 3.8 Anaconda environment with CUDA 11.3.1 and cuDNN 8.2.1, the lowest TensorFlow version that properly utilizes my RTX 3070 Ti is 2.5.0. Training runs properly, but with one problem. Instead of creating checkpoints like these:

mobilenet_segnet_no_aug.0
mobilenet_segnet_no_aug.1
mobilenet_segnet_no_aug.2
mobilenet_segnet_no_aug.3
mobilenet_segnet_no_aug.4
etc.
mobilenet_segnet_no_aug_config.json

the following checkpoints are produced:

mobilenet_segnet_no_aug.0.data-00000-of-00001
mobilenet_segnet_no_aug.0.index
mobilenet_segnet_no_aug.1.data-00000-of-00001
mobilenet_segnet_no_aug.1.index
mobilenet_segnet_no_aug.2.data-00000-of-00001
mobilenet_segnet_no_aug.2.index
mobilenet_segnet_no_aug.3.data-00000-of-00001
mobilenet_segnet_no_aug.3.index
mobilenet_segnet_no_aug.4.data-00000-of-00001
mobilenet_segnet_no_aug.4.index
etc.
mobilenet_segnet_no_aug_config.json

The problem is that after training I cannot find a way to load these new-style checkpoints so that I can evaluate my training with the predict_multiple function.

P.S. I already use your framework without any problems on a GTX 1070 with CUDA 10.0 and TensorFlow 1.14.0. Now we are trying to exploit the capabilities of a new-generation GPU.

Mar 10 '22 14:03 sotomotocross

I have the same issue when trying to predict using the resulting checkpoints. Training was fine, but I can't run any predictions. Did you solve this issue? Thanks.

Apr 19 '22 14:04 Yang-Yin

Unfortunately, I am still facing the same issue. Because it was a side project, I continued working with my GTX 1070 and the CUDA 10.0 Anaconda environment.

Apr 20 '22 08:04 sotomotocross

I have the issue the other way around. I still have checkpoints in the first format, from training with TensorFlow 2.2: mobilenet_segnet_no_aug.0 mobilenet_segnet_no_aug.1 mobilenet_segnet_no_aug.2 mobilenet_segnet_no_aug.3

But after updating to a more recent version, it fails to load the checkpoints generated with the old version. I suspect that TensorFlow may have changed its checkpoint format, which might be the issue for both of us.

Sep 13 '23 13:09 mauricedoepke

I dug a bit deeper into this issue, and my assumption that the different file formats are the problem seems to be true.

TensorFlow currently supports three file formats: h5, tensorflow, and keras. The default is now the tensorflow format. I suspect that h5 was the default when this library was originally written.
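For anyone stuck with tensorflow-format checkpoints in the meantime: as far as I know, load_weights expects the checkpoint prefix (for example mobilenet_segnet_no_aug.4), not the .index or .data-00000-of-00001 file itself. Below is a minimal sketch for finding the newest such prefix on disk. Note that latest_tf_checkpoint_prefix is a hypothetical helper, not part of this library, and the commented-out load_weights call assumes a model that was already built with the same architecture as during training.

```python
import re
from pathlib import Path
from typing import Optional


def latest_tf_checkpoint_prefix(checkpoints_path: str) -> Optional[str]:
    """Hypothetical helper (not part of image-segmentation-keras).

    Looks for files named '<checkpoints_path>.<epoch>.index' and returns
    '<checkpoints_path>.<epoch>' for the highest epoch, or None if no
    such file exists.
    """
    base = Path(checkpoints_path)
    pattern = re.compile(re.escape(base.name) + r"\.(\d+)\.index")
    epochs = []
    for entry in base.parent.iterdir():
        match = pattern.fullmatch(entry.name)
        if match:
            epochs.append(int(match.group(1)))
    if not epochs:
        return None
    return f"{checkpoints_path}.{max(epochs)}"


# Assumed usage, with `model` being a Keras model of the same architecture:
# prefix = latest_tf_checkpoint_prefix("mobilenet_segnet_no_aug")
# if prefix is not None:
#     model.load_weights(prefix)  # pass the prefix, without '.index'
```

Epochs are compared as integers, so mobilenet_segnet_no_aug.10 is treated as newer than mobilenet_segnet_no_aug.9.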

To fix this, the library needs to explicitly give the checkpoints the file extension ".h5" so that TensorFlow knows which file format to use. This has to happen everywhere the methods load_weights and save_weights are used (train.py and predict.py). That should solve the issue.
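Roughly, the change I have in mind looks like the sketch below. The helper name h5_weights_path is made up for illustration and is not taken from the library's code; the commented-out calls assume an existing tf.keras model and rely on save_weights choosing HDF5 when the path ends in ".h5". I have not tested this against every TensorFlow version, so treat it as a starting point.

```python
def h5_weights_path(checkpoints_path: str, epoch: int) -> str:
    """Hypothetical helper: build '<checkpoints_path>.<epoch>.h5'.

    The explicit '.h5' suffix is what tells tf.keras to write (and later
    read) a single HDF5 file instead of the '.index' / '.data-*' pair.
    """
    return f"{checkpoints_path}.{epoch}.h5"


# Assumed usage inside the training loop (train.py):
# model.save_weights(h5_weights_path(checkpoints_path, epoch))
#
# Assumed usage when restoring for prediction (predict.py):
# model.load_weights(h5_weights_path(checkpoints_path, epoch))
```

Any code that searches for the latest checkpoint by file name would also need to account for the new suffix.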

Maybe someone can open a PR for this. I don't have the time to do it at the moment.

Sep 13 '23 15:09 mauricedoepke
