Conditional AutoregressiveNetwork doesn't work with tfb.Chain
Dear all,
I am trying to implement a conditional MAF based on the example provided. It works fine when only one bijector is used in the `TransformedDistribution`, but as soon as a `tfb.Chain` is used it breaks, apparently because the `bijector_kwargs` are not passed correctly through the chain. The error thrown is `ValueError: conditional_input must be passed as a named argument`.
I am aware of issue #1159; however, I am trying to pass a distribution object through the chained bijector via a transformed distribution, so I don't think the solution given there applies directly. I am not sure whether I am attempting something unsupported, whether this is a bug, or whether I am simply doing it wrong.
```python
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfb = tfp.bijectors
tfd = tfp.distributions
tfkl = tf.keras.layers
tfk = tf.keras

n = 2000
c = np.r_[np.zeros(n//2), np.ones(n//2)]
mean_0, mean_1 = 0, 5
x = np.r_[np.random.randn(n//2).astype(dtype=np.float32) + mean_0,
          np.random.randn(n//2).astype(dtype=np.float32) + mean_1]

made0 = tfb.AutoregressiveNetwork(params=2, hidden_units=[2, 2], event_shape=(1,),
                                  conditional=True,
                                  kernel_initializer=tfk.initializers.VarianceScaling(0.1),
                                  conditional_event_shape=(1,))
made1 = tfb.AutoregressiveNetwork(params=2, hidden_units=[2, 2], event_shape=(1,),
                                  conditional=True,
                                  kernel_initializer=tfk.initializers.VarianceScaling(0.1),
                                  conditional_event_shape=(1,))

tot_bijector = tfb.Chain([tfb.MaskedAutoregressiveFlow(made0),
                          tfb.MaskedAutoregressiveFlow(made1)])

distribution = tfd.TransformedDistribution(
    distribution=tfd.Sample(tfd.Normal(loc=0., scale=1.), sample_shape=[1]),
    bijector=tot_bijector)

x_ = tfkl.Input(shape=(1,), dtype=tf.float32)
c_ = tfkl.Input(shape=(1,), dtype=tf.float32)
log_prob_ = distribution.log_prob(x_, bijector_kwargs={'conditional_input': c_})
model = tfk.Model([x_, c_], log_prob_)

model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.1),
              loss=lambda _, log_prob: -log_prob)

batch_size = 25
model.fit(x=[x, c],
          y=np.zeros((n, 0), dtype=np.float32),
          batch_size=batch_size,
          epochs=3,
          steps_per_epoch=n // batch_size,
          shuffle=True,
          verbose=True)
```
The solution from that issue applies here as well. Give your two MADEs names (e.g. `made0` and `made1`) and then do:

```python
distribution.log_prob(
    x_,
    bijector_kwargs=make_bijector_kwargs(
        distribution.bijector, {'made.': {'conditional_input': c_}}))
```

Note how we use the regex `made.` to match both names and pass them the same conditional input.
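To illustrate what that helper does, here is a minimal, self-contained sketch of `make_bijector_kwargs` exercised with lightweight stand-in objects instead of real TFP bijectors (the `FakeBijector`/`FakeChain` classes below are illustrative assumptions, not TFP classes):

```python
import re

def make_bijector_kwargs(bijector, name_to_kwargs):
    """Build a nested bijector_kwargs dict keyed by bijector name.

    Chain-like bijectors (anything with a `.bijectors` attribute) recurse
    into their parts; a leaf bijector receives the kwargs of the first
    regex in `name_to_kwargs` that matches its name.
    """
    if hasattr(bijector, 'bijectors'):
        return {b.name: make_bijector_kwargs(b, name_to_kwargs)
                for b in bijector.bijectors}
    for name_regex, kwargs in name_to_kwargs.items():
        if re.match(name_regex, bijector.name):
            return kwargs
    return {}

# Lightweight stand-ins (illustrative only) for a named bijector and tfb.Chain.
class FakeBijector:
    def __init__(self, name):
        self.name = name

class FakeChain:
    def __init__(self, bijectors):
        self.bijectors = bijectors

chain = FakeChain([FakeBijector('made0'), FakeBijector('made1')])
kwargs = make_bijector_kwargs(chain, {'made.': {'conditional_input': 'c'}})
print(kwargs)
# {'made0': {'conditional_input': 'c'}, 'made1': {'conditional_input': 'c'}}
```

The regex `made.` matches both `made0` and `made1`, so both flows end up with the same `conditional_input` entry in the nested kwargs dict.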
@SiegeLordEx I am getting the same error, even after doing the above. Here is my code snippet:
```python
import re

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfb = tfp.bijectors
tfd = tfp.distributions
tfkl = tf.keras.layers
tfk = tf.keras

n = 2000
c = np.r_[np.zeros(n//2), np.ones(n//2)]
mean_0, mean_1 = 0, 5
x = np.r_[np.random.randn(n//2).astype(dtype=np.float32) + mean_0,
          np.random.randn(n//2).astype(dtype=np.float32) + mean_1]

made0 = tfb.AutoregressiveNetwork(params=2, hidden_units=[2, 2], event_shape=(1,),
                                  conditional=True,
                                  kernel_initializer=tfk.initializers.VarianceScaling(0.1),
                                  conditional_event_shape=(1,), name='made0')
made1 = tfb.AutoregressiveNetwork(params=2, hidden_units=[2, 2], event_shape=(1,),
                                  conditional=True,
                                  kernel_initializer=tfk.initializers.VarianceScaling(0.1),
                                  conditional_event_shape=(1,), name='made1')

tot_bijector = tfb.Chain([tfb.MaskedAutoregressiveFlow(made0),
                          tfb.MaskedAutoregressiveFlow(made1)])

distribution = tfd.TransformedDistribution(
    distribution=tfd.Sample(tfd.Normal(loc=0., scale=1.), sample_shape=[1]),
    bijector=tot_bijector)

def make_bijector_kwargs(bijector, name_to_kwargs):
    if hasattr(bijector, 'bijectors'):
        return {b.name: make_bijector_kwargs(b, name_to_kwargs)
                for b in bijector.bijectors}
    else:
        for name_regex, kwargs in name_to_kwargs.items():
            if re.match(name_regex, bijector.name):
                return kwargs
    return {}

x_ = tfkl.Input(shape=(1,), dtype=tf.float32)
c_ = tfkl.Input(shape=(1,), dtype=tf.float32)

log_prob_ = distribution.log_prob(
    x_,
    bijector_kwargs=make_bijector_kwargs(
        distribution.bijector, {'made.': {'conditional_input': c_}}))

model = tfk.Model([x_, c_], log_prob_)
```
Any help would be appreciated. Thanks!
I have it working now with a small adjustment to the solution offered: write `tfb.MaskedAutoregressiveFlow(made0, name='maf0')` (and likewise `name='maf1'` for the second flow), and then use `'maf.'` instead of `'made.'` in the bijector kwargs. Thanks for your help.
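For anyone hitting the same wall, this is why the rename matters: the helper matches each bijector's own `name`, and a `tfb.MaskedAutoregressiveFlow` keeps its default name regardless of what the inner `AutoregressiveNetwork` is called, so a name given to the MADE is never seen. A minimal sketch with stand-in objects (no TFP required; the `Fake*` classes and the default-name strings below are illustrative assumptions):

```python
import re

def make_bijector_kwargs(bijector, name_to_kwargs):
    # Recurse into chain-like bijectors; regex-match leaf bijector names.
    if hasattr(bijector, 'bijectors'):
        return {b.name: make_bijector_kwargs(b, name_to_kwargs)
                for b in bijector.bijectors}
    for name_regex, kwargs in name_to_kwargs.items():
        if re.match(name_regex, bijector.name):
            return kwargs
    return {}

class FakeMAF:
    # Stand-in for tfb.MaskedAutoregressiveFlow: only its own name is
    # visible to the helper, never the name of the inner network.
    def __init__(self, name):
        self.name = name

class FakeChain:
    def __init__(self, bijectors):
        self.bijectors = bijectors

cond = {'conditional_input': 'c'}

# Naming only the inner MADE networks: the MAF wrappers still carry their
# default names, so the 'made.' regex matches nothing -> empty kwargs.
default_named = FakeChain([FakeMAF('masked_autoregressive_flow'),
                           FakeMAF('masked_autoregressive_flow_1')])
no_match = make_bijector_kwargs(default_named, {'made.': cond})
print(no_match)

# Naming the MAF wrappers themselves and matching 'maf.' routes the
# conditional input to both flows.
maf_named = FakeChain([FakeMAF('maf0'), FakeMAF('maf1')])
routed = make_bijector_kwargs(maf_named, {'maf.': cond})
print(routed)
```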