bayesian-neural-network-blogpost
Negative loss & logits_variance_loss
Hi, I have created a Bayesian CNN classifier as described in this repo, but my model's loss is always negative, and so is the logits_variance_loss (see screenshot below). Any idea why that is happening?
@GKalliatakis Hi, can you please share the code for your loss function along with the training loop code?
The loss function is exactly the one described in this repo:
# Bayesian categorical cross entropy.
# N data points, C classes, T monte carlo simulations
# true - true values. Shape: (N, C)
# pred_var - predicted logit values and variance. Shape: (N, C + 1)
# returns - loss (N,)
def bayesian_categorical_crossentropy(T, num_classes):
    def bayesian_categorical_crossentropy_internal(true, pred_var):
        # shape: (N,)
        std = K.sqrt(pred_var[:, num_classes:])
        # shape: (N,)
        variance = pred_var[:, num_classes]
        variance_depressor = K.exp(variance) - K.ones_like(variance)
        # shape: (N, C)
        pred = pred_var[:, 0:num_classes]
        # shape: (N,)
        undistorted_loss = K.categorical_crossentropy(pred, true, from_logits=True)
        # shape: (T,)
        iterable = K.variable(np.ones(T))
        dist = distributions.Normal(loc=K.zeros_like(std), scale=std)
        monte_carlo_results = K.map_fn(gaussian_categorical_crossentropy(true, pred, dist, undistorted_loss, num_classes), iterable, name='monte_carlo_results')
        variance_loss = K.mean(monte_carlo_results, axis=0) * undistorted_loss
        return variance_loss + undistorted_loss + variance_depressor
    return bayesian_categorical_crossentropy_internal
# for a single monte carlo simulation,
# calculate categorical_crossentropy of
# predicted logit values plus gaussian
# noise vs true values.
# true - true values. Shape: (N, C)
# pred - predicted logit values. Shape: (N, C)
# dist - normal distribution to sample from. Shape: (N, C)
# undistorted_loss - the crossentropy loss without variance distortion. Shape: (N,)
# num_classes - the number of classes. C
# returns - total differences for all classes (N,)
def gaussian_categorical_crossentropy(true, pred, dist, undistorted_loss, num_classes):
    def map_fn(i):
        std_samples = K.transpose(dist.sample(num_classes))
        distorted_loss = K.categorical_crossentropy(pred + std_samples, true, from_logits=True)
        diff = undistorted_loss - distorted_loss
        return -K.elu(diff)
    return map_fn
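For what it's worth, one thing I noticed while staring at this: the Monte Carlo term returned by map_fn is -K.elu(diff), which is negative whenever the distorted loss is smaller than the undistorted loss, so variance_loss (and with it the combined loss) is not guaranteed to stay positive. A quick standalone NumPy sketch of that term in isolation (illustrative values only):

import numpy as np

def elu(x):
    # element-wise ELU, same shape-preserving behaviour as K.elu
    return np.where(x > 0, x, np.exp(x) - 1.0)

undistorted_loss = 1.2
# distorted losses from a few hypothetical Monte Carlo draws
distorted_losses = np.array([0.4, 0.9, 2.5])

diff = undistorted_loss - distorted_losses
monte_carlo_results = -elu(diff)
print(monte_carlo_results)                            # [-0.8, -0.3, ~0.73] -> individual draws can be negative
print(monte_carlo_results.mean() * undistorted_loss)  # the variance_loss term comes out negative here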
Then the model was compiled with the following settings (again as described in this repo):
# Compile the model using two losses, one is the aleatoric uncertainty loss function
# and the other is the standard categorical cross entropy function.
self.model.compile(
    optimizer=Adam(lr=1e-3, decay=0.001),
    # optimizer=SGD(lr=1e-5, momentum=0.9),
    loss={'logits_variance': bayesian_categorical_crossentropy(self.monte_carlo_simulations, self.classes),  # aleatoric uncertainty loss function
          'softmax_output': 'categorical_crossentropy'  # standard categorical cross entropy function
          # 'softmax_output': standard_categorical_cross_entropy  # standard categorical cross entropy function
          },
    metrics={'softmax_output': metrics.categorical_accuracy},
    # the aleatoric uncertainty loss function is weighted less than the categorical cross entropy loss
    # because the aleatoric uncertainty loss includes the categorical cross entropy loss as one of its terms.
    loss_weights={'logits_variance': .2, 'softmax_output': 1.}
)
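For context, the compile call above assumes the model has two named outputs: 'logits_variance' (the C logits concatenated with a single predicted variance unit, fed to the aleatoric loss) and 'softmax_output' (fed to the standard cross entropy). A minimal sketch of how such heads might be wired; the body, layer sizes and activation here are placeholders, not the actual architecture:

from keras.layers import Input, Flatten, Dense, Concatenate, Activation
from keras.models import Model

num_classes = 10   # illustrative
img_width = 224    # illustrative

inputs = Input(shape=(img_width, img_width, 3))
features = Dense(128, activation='relu')(Flatten()(inputs))   # placeholder CNN body

logits = Dense(num_classes)(features)                  # raw logits, shape (N, C)
variance = Dense(1, activation='softplus')(features)   # predicted variance, shape (N, 1)

# output fed to the aleatoric loss: logits and variance concatenated -> shape (N, C + 1)
logits_variance = Concatenate(name='logits_variance')([logits, variance])
# output fed to the standard categorical cross entropy
softmax_output = Activation('softmax', name='softmax_output')(logits)

model = Model(inputs=inputs, outputs=[logits_variance, softmax_output])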
The only thing I am concerned about, and which differs from the implementation described here, is the way raw images are fed in during training: we are dealing with a multi-output model, while the author of this repo works with a smaller dataset that allows him to call model.fit directly.
In my case I have created a custom generator:
def multiple_outputs(generator, image_dir, batch_size, image_size, subset):
    gen = generator.flow_from_directory(
        image_dir,
        target_size=(image_size, image_size),
        batch_size=batch_size,
        class_mode='categorical',
        subset=subset)
    while True:
        gnext = gen.next()
        # yield the image batch and two copies of the labels (one per model output)
        yield gnext[0], [gnext[1], gnext[1]]
which is used as follows:
datagen = ImageDataGenerator(rescale=1. / 255, validation_split=0.20)
custom_train_generator = multiple_outputs(generator=datagen,
                                          image_dir=base_dir,
                                          batch_size=train_batch_size,
                                          image_size=img_width,
                                          subset='training')
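A quick sanity check of the generator output, just to confirm it yields one image batch plus two identical label arrays (one per named output):

x_batch, y_batches = next(custom_train_generator)
print(x_batch.shape)                  # (train_batch_size, img_width, img_width, 3)
print([y.shape for y in y_batches])   # two arrays of shape (train_batch_size, num_classes)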
and then the custom generator is passed during fit:
history = self.model.fit_generator(custom_train_generator,
                                   epochs=nb_of_epochs,
                                   steps_per_epoch=steps_per_epoch,
                                   validation_data=validation_data,
                                   validation_steps=validation_steps,
                                   callbacks=callbacks_list)
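where validation_data is a second generator built with the same helper, roughly like this (sketch; the exact call may differ), using the 'validation' subset of the split:

custom_valid_generator = multiple_outputs(generator=datagen,
                                          image_dir=base_dir,
                                          batch_size=train_batch_size,
                                          image_size=img_width,
                                          subset='validation')
validation_data = custom_valid_generator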
Any thoughts?
Hello @GKalliatakis, were you able to solve this issue?
Hello, I think the order of the arguments in the K.categorical_crossentropy calls is wrong 🤔. In the Keras documentation the arguments appear with y_true as the first argument and y_pred as the second:

undistorted_loss = K.categorical_crossentropy(true, pred, from_logits=True)
distorted_loss = K.categorical_crossentropy(true, pred + std_samples, from_logits=True)

Should the pred and true arguments be swapped?
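A quick numeric check shows the two orderings really do give different values (sketch, assuming a recent Keras version where the target comes first):

from keras import backend as K

true = K.constant([[1., 0., 0.]])
logits = K.constant([[2.0, 1.0, 0.1]])

# documented order: target first, prediction second
documented = K.eval(K.categorical_crossentropy(true, logits, from_logits=True))
# swapped order, as in the loss code above
swapped = K.eval(K.categorical_crossentropy(logits, true, from_logits=True))

print(documented, swapped)   # the two orderings give different losses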