
Unable to use "mixed_float16" in the Object Detection API

Open tq3940 opened this issue 8 months ago • 3 comments

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [√] I am using the latest TensorFlow Model Garden release and TensorFlow 2.
  • [√] I am reporting the issue to the correct repository. (Model Garden official or research directory)
  • [√] I checked to make sure that this issue has not already been filed.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/research/object_detection

2. Describe the bug

I'm trying to use "mixed_float16" to speed up training on an RTX 4090. Following the official mixed_precision guide, I added mixed_precision.set_global_policy('mixed_float16') just before tf.compat.v1.app.run() in my train_tf2.py. However, TensorFlow reported the following error:

        return _compute_losses_and_predictions_dicts(model, features, labels,
    File "/root/miniconda3/lib/python3.8/site-packages/object_detection/model_lib_v2.py", line 130, in _compute_losses_and_predictions_dicts  *
        losses_dict = model.loss(
    File "/root/miniconda3/lib/python3.8/site-packages/object_detection/meta_architectures/center_net_meta_arch.py", line 3967, in loss  *
        object_center_loss = self._compute_object_center_loss(
    File "/root/miniconda3/lib/python3.8/site-packages/object_detection/meta_architectures/center_net_meta_arch.py", line 3099, in _compute_object_center_loss  *
        loss += object_center_loss(
    File "/root/miniconda3/lib/python3.8/site-packages/object_detection/core/losses.py", line 94, in __call__  *
        return self._compute_loss(prediction_tensor, target_tensor, **params)
    File "/root/miniconda3/lib/python3.8/site-packages/object_detection/core/losses.py", line 855, in _compute_loss  *
        negative_loss = (tf.math.pow((1 - target_tensor), self._beta)*

    TypeError: Input 'y' of 'Mul' Op has type float16 that does not match type float32 of argument 'x'.
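
The mismatch is easy to reproduce outside the Object Detection API: under the 'mixed_float16' policy, Keras layers compute in float16 while label/target tensors stay float32, so the multiply inside the loss mixes dtypes. A minimal standalone sketch (illustrative names, not Model Garden code):

    import tensorflow as tf
    from tensorflow.keras import mixed_precision

    mixed_precision.set_global_policy('mixed_float16')

    @tf.function
    def toy_loss(prediction_tensor, target_tensor):
        # Mirrors the failing line in losses.py: a float32 factor
        # multiplied by a float16 factor fails at graph-build time.
        beta = 4.0
        return tf.math.pow(1.0 - target_tensor, beta) * prediction_tensor

    dense = tf.keras.layers.Dense(1)         # computes in float16 under the policy
    preds = dense(tf.ones([2, 3]))           # dtype: float16
    targets = tf.zeros([2, 1], tf.float32)   # targets stay float32

    toy_loss(preds, targets)
    # TypeError: Input 'y' of 'Mul' Op has type float16 that does not match
    # type float32 of argument 'x'.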

I also tried tf.compat.v2.keras.mixed_precision.set_global_policy('mixed_float16'), adapted from the tf.compat.v2.keras.mixed_precision.set_global_policy('mixed_bfloat16') call found in model_lib_v2.py, and setting the environment variable os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1', as suggested in this answer (a sketch of both variants is shown below).

All of these attempts failed with the same error shown above. How can I get mixed_float16 training to work?
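
For reference, a sketch of those two variants as they were placed in train_tf2.py (both before tf.compat.v1.app.run()):

    import os
    import tensorflow as tf

    # Variant 2: the compat-style call, adapted from the 'mixed_bfloat16'
    # line in object_detection/model_lib_v2.py.
    tf.compat.v2.keras.mixed_precision.set_global_policy('mixed_float16')

    # Variant 3: the legacy graph-rewrite auto mixed precision switch.
    os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1'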

3. Steps to reproduce

Add mixed_precision.set_global_policy('mixed_float16') just before tf.compat.v1.app.run() in train_tf2.py, as in the sketch below.
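
A minimal sketch of that change, assuming train_tf2.py follows the structure of the standard object_detection/model_main_tf2.py entry point (flag parsing and the train_loop call are elided):

    # train_tf2.py (sketch, modelled on object_detection/model_main_tf2.py)
    import tensorflow.compat.v2 as tf
    from tensorflow.keras import mixed_precision

    def main(unused_argv):
        # ... parse flags and call model_lib_v2.train_loop(...) as usual ...
        pass

    if __name__ == '__main__':
        # Set the global dtype policy before the training entry point runs.
        mixed_precision.set_global_policy('mixed_float16')
        tf.compat.v1.app.run()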

4. Expected behavior

The model can be trained with the "mixed_float16" (mixed precision) policy.

5. Additional context

None

6. System information

  • OS Platform and Distribution : Linux Ubuntu 22.04
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.13.1
  • Python version: 3.8.10
  • CUDA/cuDNN version: CUDA 12.2 / cuDNN 8.6.0.163
  • GPU model and memory: NVIDIA GeForce RTX 4090, 24 GB

tq3940 · Jun 02 '24 07:06