Error caused by SigmoidFocalCrossEntropy with kernel regularizer
System information
- OS: Linux Ubuntu 16.04
- TensorFlow: tensorflow-gpu 2.2.0 installed via Anaconda (conda install), binary (the Anaconda repository does not currently support a newer TensorFlow)
- TensorFlow-Addons: tensorflow-addons 0.11.2 via pip (pip install tensorflow-addons==0.11.2); pip was installed in this conda environment; a newer tfa requires a newer tf
- Python version: 3.8
- Is GPU used? (yes/no): yes
- Using tf.keras, not standalone keras
Describe the bug
I have L2 kernel regularizer set for some of the (keras) layers.
tfa.losses.SigmoidFocalCrossEntropy() was used as the loss function.
After the model was built and compiled, model.fit was called and the following exception occurred:
ValueError: Shapes must be equal rank, but are 1 and 0 From merging shape 0 with other shapes. for '{{node AddN}} = AddN[N=2, T=DT_FLOAT](sigmoid_focal_crossentropy/weighted_loss/Mul, d1_7/kernel/Regularizer/add)' with input shapes: [?], [].
The full stack trace is long, so it is appended at the end.
Code to reproduce the issue
Run the following code:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense
import tensorflow_addons as tfa
model = keras.Sequential([
    Dense(5, activation='relu', kernel_regularizer='l2', name='d1', input_shape=(12,)),
    Dense(5, activation='softmax', name='dout')
])
model.compile(optimizer='adam', loss=tfa.losses.SigmoidFocalCrossEntropy(), metrics=['accuracy'])
model.summary()
# random data with desired shape was used to help with faster reproduction
model.fit(np.random.randn(64, 12), tf.one_hot(np.random.randint(0,5,64),5))
This raised the exception mentioned above.
After removing kernel_regularizer='l2', the exception disappeared and the training progress bar appeared as expected.
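The two input shapes in the error message, a rank-1 loss and a rank-0 regularizer term, can be illustrated without TensorFlow. The following is only a NumPy sketch, not the Keras code path: strict_add_n is a hypothetical stand-in for an AddN-style sum that refuses to broadcast, and the 0.01 coefficient is an assumed L2 factor.

```python
import numpy as np

def strict_add_n(tensors):
    """Hypothetical stand-in for an AddN-style sum: all inputs must share one shape."""
    arrays = [np.asarray(t, dtype=np.float32) for t in tensors]
    shapes = [a.shape for a in arrays]
    if len(set(shapes)) != 1:
        raise ValueError(f"inputs must share one shape, got {shapes}")
    return np.stack(arrays).sum(axis=0)

rng = np.random.default_rng(0)

# An unreduced loss has one value per sample: shape (32,).
per_sample_loss = rng.random(32).astype(np.float32)

# An L2 kernel penalty is a single number: shape ().
kernel = rng.standard_normal((12, 5)).astype(np.float32)
l2_penalty = 0.01 * np.sum(kernel ** 2)  # 0.01 is an assumed coefficient

# Summing the two as-is fails because the ranks differ.
try:
    strict_add_n([per_sample_loss, l2_penalty])
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)

# Reducing the loss to a scalar first makes the shapes agree.
total = strict_add_n([per_sample_loss.mean(), l2_penalty])

print(mismatch_error)
print(total.shape)
```

Without the regularizer there is only one term to sum, which would be consistent with the exception disappearing when kernel_regularizer is removed.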
Other info / logs
Full stack trace: (You may want to skip it)
ValueError Traceback (most recent call last)
<ipython-input-3-cd3ce786e484> in <module>
11 model.compile(optimizer='adam', loss=tfa.losses.SigmoidFocalCrossEntropy(), metrics=['accuracy'])
12 model.summary()
---> 13 model.fit(np.random.randn(64, 12), np.random.randint(0,5,64))
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
64 def _method_wrapper(self, *args, **kwargs):
65 if not self._in_multi_worker_mode(): # pylint: disable=protected-access
---> 66 return method(self, *args, **kwargs)
67
68 # Running inside `run_distribute_coordinator` already.
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
846 batch_size=batch_size):
847 callbacks.on_train_batch_begin(step)
--> 848 tmp_logs = train_function(iterator)
849 # Catch OutOfRangeError for Datasets of unknown size.
850 # This blocks until the batch has finished executing.
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
578 xla_context.Exit()
579 else:
--> 580 result = self._call(*args, **kwds)
581
582 if tracing_count == self._get_tracing_count():
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
625 # This is the first call of __call__, so we have to initialize.
626 initializers = []
--> 627 self._initialize(args, kwds, add_initializers_to=initializers)
628 finally:
629 # At this point we know that the initialization is complete (or less
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py in _initialize(self, args, kwds, add_initializers_to)
503 self._graph_deleter = FunctionDeleter(self._lifted_initializer_graph)
504 self._concrete_stateful_fn = (
--> 505 self._stateful_fn._get_concrete_function_internal_garbage_collected( # pylint: disable=protected-access
506 *args, **kwds))
507
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
2444 args, kwargs = None, None
2445 with self._lock:
-> 2446 graph_function, _, _ = self._maybe_define_function(args, kwargs)
2447 return graph_function
2448
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/function.py in _maybe_define_function(self, args, kwargs)
2775
2776 self._function_cache.missed.add(call_context_key)
-> 2777 graph_function = self._create_graph_function(args, kwargs)
2778 self._function_cache.primary[cache_key] = graph_function
2779 return graph_function, args, kwargs
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
2655 arg_names = base_arg_names + missing_arg_names
2656 graph_function = ConcreteFunction(
-> 2657 func_graph_module.func_graph_from_py_func(
2658 self._name,
2659 self._python_function,
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
979 _, original_func = tf_decorator.unwrap(python_func)
980
--> 981 func_outputs = python_func(*func_args, **func_kwargs)
982
983 # invariant: `func_outputs` contains only Tensors, CompositeTensors,
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py in wrapped_fn(*args, **kwds)
439 # __wrapped__ allows AutoGraph to swap in a converted function. We give
440 # the function a weak reference to itself to avoid a reference cycle.
--> 441 return weak_wrapped_fn().__wrapped__(*args, **kwds)
442 weak_wrapped_fn = weakref.ref(wrapped_fn)
443
~/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
966 except Exception as e: # pylint:disable=broad-except
967 if hasattr(e, "ag_error_metadata"):
--> 968 raise e.ag_error_metadata.to_exception(e)
969 else:
970 raise
ValueError: in user code:
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:571 train_function *
outputs = self.distribute_strategy.run(
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:951 run **
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
return fn(*args, **kwargs)
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:532 train_step **
loss = self.compiled_loss(
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/keras/engine/compile_utils.py:238 __call__
total_loss_metric_value = math_ops.add_n(loss_metric_values)
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:180 wrapper
return target(*args, **kwargs)
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py:3239 add_n
return gen_math_ops.add_n(inputs, name=name)
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py:419 add_n
_, _, _op, _outputs = _op_def_library._apply_op_helper(
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py:742 _apply_op_helper
op = g._create_op_internal(op_type_name, inputs, dtypes=None,
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/func_graph.py:593 _create_op_internal
return super(FuncGraph, self)._create_op_internal( # pylint: disable=protected-access
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:3319 _create_op_internal
ret = Operation(
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:1816 __init__
self._c_op = _create_c_op(self._graph, node_def, inputs,
/home/omnisky/anaconda3/envs/tf2d1/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:1657 _create_c_op
raise ValueError(str(e))
ValueError: Shapes must be equal rank, but are 1 and 0
From merging shape 0 with other shapes. for '{{node AddN}} = AddN[N=2, T=DT_FLOAT](sigmoid_focal_crossentropy/weighted_loss/Mul, d1/kernel/Regularizer/add)' with input shapes: [32], [].
A full stack trace with run_eagerly=True can be provided on request.
Thanks~
Sorry for the late reply. You have to use tfa.losses.SigmoidFocalCrossEntropy(reduction=tf.keras.losses.Reduction.AUTO) to reduce the loss to a scalar. I'm not sure why we make it default to NONE. @AakashKumarNain Could you confirm whether this is an issue or not?
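To make the effect of the reduction argument concrete, here is a rough NumPy sketch of a sigmoid focal loss with and without reduction. It is an illustrative re-implementation under assumptions, not the tensorflow-addons source: the function name, the alpha=0.25 and gamma=2.0 defaults, the clipping constant, and the "mean" option (standing in for what AUTO typically resolves to under a plain model.fit) are all choices made for this example.

```python
import numpy as np

def sigmoid_focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0, reduction="none"):
    """Illustrative focal loss on probabilities; not the tfa implementation."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip to avoid log(0); the epsilon is an assumption for this sketch.
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), 1e-7, 1 - 1e-7)

    # Element-wise binary cross-entropy.
    ce = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    # Probability assigned to the true label, and the class-balancing weight.
    p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)
    alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)

    # One loss value per sample: sum over the class axis.
    per_sample = np.sum(alpha_t * (1 - p_t) ** gamma * ce, axis=-1)

    if reduction == "none":
        return per_sample          # shape (batch,)
    if reduction == "mean":
        return per_sample.mean()   # scalar
    raise ValueError(f"unknown reduction: {reduction!r}")

rng = np.random.default_rng(0)
labels = np.eye(5)[rng.integers(0, 5, size=64)]  # one-hot targets, shape (64, 5)
probs = rng.random((64, 5))                      # stand-in predictions

unreduced = sigmoid_focal_loss(labels, probs, reduction="none")
reduced = sigmoid_focal_loss(labels, probs, reduction="mean")

print(unreduced.shape)
print(np.ndim(reduced))
```

With reduction "none" the loss keeps the batch dimension, which is the rank-1 input seen in the error; with a scalar reduction it has rank 0 and matches the shape of the regularization term.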
We have a little documentation on the reduction parameter here: https://github.com/tensorflow/models/blob/master/official/vision/keras_cv/losses/focal_loss.py#L37
@WindQAQ Yes, that needs to be changed to AUTO and we need to make a few other changes as well. But I won't be able to fix it before next week.
In the meantime, I have put this in the ecosystem review because I want to check how we want to handle these duplicated but not strictly aligned implementations.
Agreed
Sorry for the late reply. You have to use tfa.losses.SigmoidFocalCrossEntropy(reduction=tf.keras.losses.Reduction.AUTO) to reduce the loss to scalar. I'm not sure why we make it default to NONE. @AakashKumarNain Could you confirm that it's an issue or not?
Thanks it works. QAQ
I faced the same issue, and using tfa.losses.SigmoidFocalCrossEntropy(reduction=tf.keras.losses.Reduction.AUTO) worked like a charm.
https://github.com/tensorflow/models/blob/master/official/vision/keras_cv/losses/focal_loss.py#L37
This link is not working. Can you please share the updated link?
@ravinderkhatri Keras-cv is being refactored.
We have a PR at https://github.com/tensorflow/addons/pull/2422
Having the same issue but setting reduction=tf.keras.losses.Reduction.AUTO fixed it. Surprised this isn't the default in tensorflow-addons
We already have official upstream APIs now: https://github.com/keras-team/keras-cv/issues/1117