Duplicate logging messages after calling `model.save()`
System information.
- Have I written custom code (as opposed to using a stock example script provided in Keras): No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Issue on both CentOS8 Stream and Ubuntu 22.04
- TensorFlow installed from (source or binary): binary
- TensorFlow version (use command below): v2.9.0-18-gd8ce9f9c301 2.9.1
- Python version: 3.9
- Bazel version (if compiling from source): N/A
- GPU model and memory: A100 (40 GB)
- Exact command to reproduce:
import logging
import tensorflow as tf
logger = logging.getLogger('test')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())
logger.info("Message (1) not duplicated")
model = tf.keras.applications.Xception()
logger.info("Message (2) not duplicated")
model.save('test')
logger.info("Message is duplicated")
Describe the problem.
When using a logger from Python's logging module, messages are duplicated after a model is saved.
Describe the current behavior.
Running the above script will yield (some output not shown):
Message (1) not duplicated
Message (2) not duplicated
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 40). These functions will not be directly callable after loading.
Message is duplicated
INFO:test:Message is duplicated
Logging is not duplicated if logging.basicConfig() is called before model saving:
...
logger.setLevel(logging.INFO)
logging.basicConfig()
logger.info("Message (1) not duplicated")
...
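Another user-side mitigation (not part of the original report, but a standard logging idiom) is to disable propagation on the user's logger, so its records never reach whatever handler TensorFlow later attaches to the root logger:

```python
import logging

logger = logging.getLogger('test')
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())
logger.propagate = False  # records stop at this logger's own handler

# Even if a handler is later added to the root logger (as model.save()
# does as a side effect), messages from 'test' are emitted only once.
logger.info("Message not duplicated")
```

The trade-off is that records from this logger will no longer reach any root-level handlers the application configures intentionally.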
This behavior is only observed in TensorFlow 2.9 and does not occur with versions <2.8.
I suspect the issue is in tensorflow.python.saved_model.save:268, where a module-level logging.warning() call may unintentionally add a stream handler to the root logger if none is configured.
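The mechanism can be reproduced with the standard library alone, without TensorFlow: a bare module-level logging.warning() call invokes logging.basicConfig() internally when the root logger has no handlers, installing a default StreamHandler on it, after which records from any propagating logger are emitted twice. A minimal sketch:

```python
import logging

user_logger = logging.getLogger('test')
user_logger.setLevel(logging.INFO)
user_logger.addHandler(logging.StreamHandler())

user_logger.info("printed once")

# Module-level call, analogous to the one in saved_model/save.py:
# it calls basicConfig() internally and attaches a StreamHandler to root.
logging.warning("side effect: root logger now has a handler")

assert logging.getLogger().handlers    # root handler was added
user_logger.info("printed twice")      # own handler + root handler
```

Because 'test' propagates to root by default, each record is now handled both by the logger's own StreamHandler and by the handler basicConfig() attached to root.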
Describe the expected behavior.
Logging messages are not duplicated.
Standalone code to reproduce the issue.
An issue was first opened on the TensorFlow repo, and their team posted a gist which reproduces the issue here. They requested that the issue be moved to the Keras repo.
@gadagashwini, I was able to reproduce the issue on TensorFlow v2.8, v2.9 and nightly. Kindly find the gist of it here.
I will take a look.
Environment
I have written a script here which reproduces the problem in TF 2.7.3 / 2.8.2 / 2.9.1 via a simple Sequential model.
The problem can also be observed in some of the official TensorFlow guides, e.g. Using the SavedModel format:
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmp_terpd74/mobilenet/1/assets
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmp_terpd74/mobilenet/1/assets
Cause
As mentioned by @jamesdolezal, the cause appears to be misuse of the root logger in several of TensorFlow's Python modules, contrary to best practices for logging in Python libraries.
Offending modules include tensorflow.python.saved_model.save which calls absl.logging.warning directly (rather than using a specific logger): https://github.com/tensorflow/tensorflow/blob/v2.7.3/tensorflow/python/saved_model/save.py#L263
This results in a stream handler being added to the root logger (if no handler has been attached yet): https://github.com/abseil/abseil-py/blob/v1.1.0/absl/logging/__init__.py#L547
Potential solution
Replacing each import
from absl import logging
with
from tensorflow.python.platform import tf_logging as logging
should resolve the problem, as the latter routes messages through TensorFlow's logger instance (tf.get_logger()).
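The reason this works: messages routed through a dedicated named logger never touch the root logger's handler list, whereas module-level logging/absl calls install a root handler as a side effect. The pattern can be sketched with the standard library (the names below are illustrative, not TensorFlow's actual internals):

```python
import logging

# Library-style logging: warnings go through a dedicated named logger,
# analogous to tf_logging's 'tensorflow' logger (tf.get_logger()).
_lib_logger = logging.getLogger('mylib')  # hypothetical library logger

def save_model(path):
    # Unlike absl.logging.warning / logging.warning, this never calls
    # basicConfig(), so the root logger's handler list stays untouched.
    _lib_logger.warning("Found untraced functions while saving %s", path)

handlers_before = list(logging.getLogger().handlers)
save_model('/tmp/model')
assert logging.getLogger().handlers == handlers_before
```

This follows the standard recommendation for Python libraries: log through `logging.getLogger(__name__)` (or a fixed library logger) and leave handler configuration to the application.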