tf-keras
TensorFlow/Keras 2.10 initializer randomness changes: How to get 100% reproducible results?
Starting with TensorFlow 2.10, the behavior of weight initialization has changed. Previously, you could get perfectly reproducible results by simply setting the numpy/tensorflow global seeds and the seeds for the data loader, without needing to touch the model code itself.
However, with 2.10 it is now required to pass a specific initializer with a seed attached, e.g. keras.layers.Dense(1, kernel_initializer=keras.initializers.GlorotUniformV2(seed=1)). Obviously, this is a much cleaner choice from an API point of view, but the following problems occur when you want to achieve perfectly reproducible results.
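For illustration, here is a minimal sketch of the difference as I understand it (layer sizes and seed values are made up for the example):

import tensorflow as tf
from tensorflow import keras

# Pre-2.10: setting the global seeds was enough to make weight
# initialization deterministic, without touching the model code.
tf.random.set_seed(42)
old_style = keras.layers.Dense(1)

# 2.10+: each initializer needs its own explicit seed to be reproducible.
new_style = keras.layers.Dense(
    1, kernel_initializer=keras.initializers.GlorotUniform(seed=42)
)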
1.) For many architectures, we can NOT modify the code (e.g. when using the built-in keras.applications models), and hence cannot set seeded initializers. Arguably, this is usually not the main use case as you'd load pre-trained weights, but even the classifier added on top wouldn't be fixed.
2.) When no initializer is passed, the default type is derived automatically based on the dtype/shape. This means that if I want to modify my existing architecture to include seeded initializers, I would first have to check the default initializer classes (for each layer!) and then assign seeds to all involved initializers, which is obviously going to be a major effort (see the snippet after this list).
3.) We are also not able to solve this by backtracking the Keras graph, searching for initializer attributes and setting the seed / random generator attributes (see also my attempt below), because at the time we construct the Model(), the initializers have already been called and the weights created. This means we can only re-initialize the layers, as shown below.
4.) Even if we managed to automate this, my understanding is that given an initializer class, a fixed shape and a fixed seed, the values will always be the same. This means that if by any chance the same variable shape occurs twice, setting all initializers to the same seed would produce identical initial values, even though we don't necessarily want that. So we would probably need to "hack" around this and derive the seed from something like hash(layer.name + model_seed).
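To illustrate point 2, a rough sketch for a single Dense layer (the printed class names are what I'd expect for the current Dense defaults; other layer types have different defaults):

from tensorflow import keras

layer = keras.layers.Dense(8)
# The defaults only become concrete initializer objects after construction.
print(type(layer.kernel_initializer).__name__)  # e.g. GlorotUniform
print(type(layer.bias_initializer).__name__)    # e.g. Zeros

# To seed the layer, each default has to be looked up and re-created explicitly:
seeded = keras.layers.Dense(
    8,
    kernel_initializer=keras.initializers.GlorotUniform(seed=1),
    bias_initializer=keras.initializers.Zeros(),  # Zeros takes no seed
)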
Is there any more straightforward way of achieving this? This may be better suited as a feature request, with the feature being a concise way of ensuring model reproducibility. My current attempt:
import xxhash  # third-party package (pip install xxhash), used to derive per-layer seeds


def set_initializer_seeds(inputs, outputs, seeds) -> "Model":
    """
    Makes sure that every variable initializer has a certain seed attached, for reproducibility.

    :param inputs: the model input tensor(s)
    :param outputs: the model output tensors (iterable)
    :param seeds: base seed, hashed together with layer/attribute names
    :return: a Model built from inputs/outputs with re-seeded, re-built layers
    """
    # pylint: disable=protected-access
    from keras.models import Model
    from keras.layers import Layer
    from keras import backend as K

    visited = set()

    def _backtrack(cn):
        idx = id(cn)
        if idx in visited:
            return
        visited.add(idx)
        layer: "Layer" = cn.node.layer
        re_build = False
        # Look for public *_initializer attributes (kernel_initializer, bias_initializer, ...)
        for attr in dir(layer):
            if "initializer" in attr and not attr.startswith("_") and not attr.endswith("_"):
                init = getattr(layer, attr, None)
                if init is not None and hasattr(init, "seed") and init.seed is None:
                    # Derive a per-layer/per-attribute seed so that equal shapes
                    # do not end up with identical initial values (see point 4).
                    hashed_seed = xxhash.xxh32(layer.name + attr + str(seeds)).intdigest()
                    init.seed = hashed_seed
                    if hasattr(init, "_random_generator"):
                        init._random_generator = K.RandomGenerator(
                            hashed_seed, rng_type="stateless"
                        )
                    re_build = True
        if re_build:
            # The weights were already created with the unseeded initializer,
            # so drop them and build the layer again with the seeded one.
            layer._trainable_weights = []
            layer._non_trainable_weights = []
            layer.build(layer.input_shape)
        for inbound in cn.node.keras_inputs:
            _backtrack(inbound)

    for o in outputs:
        _backtrack(o)
    return Model(inputs, outputs)
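For completeness, this is roughly how I call it on a dummy functional model (assuming the graph traversal above works as intended):

from tensorflow import keras

inputs = keras.Input(shape=(16,))
hidden = keras.layers.Dense(8, activation="relu")(inputs)
outputs = keras.layers.Dense(1)(hidden)

# Re-seed every initializer deterministically, then rebuild the Model.
model = set_initializer_seeds(inputs, [outputs], seeds=1234)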