Multimodal-Learning-with-Alternating-Unimodal-Adaptation
Problem with calculating the entropy
Thanks for your exciting work on the problem of multimodal imbalance. However, I ran into some trouble when running the code.
In main.py, the code for calculating entropy is shown below:
def calculate_entropy(output):
    probabilities = F.softmax(output, dim=0)
    # probabilities = F.softmax(output, dim=1)
    log_probabilities = torch.log(probabilities)
    entropy = -torch.sum(probabilities * log_probabilities)
    return entropy
The size of output is [B, N] (B is the batch size and N is the number of categories). According to Equation (8) in the paper, the correct code to compute the entropy should be:
def calculate_entropy(output):
    # probabilities = F.softmax(output, dim=0)
    probabilities = F.softmax(output, dim=1)  # softmax over categories
    log_probabilities = torch.log(probabilities)
    # Without "dim=-1", the entropies of the whole batch are summed into a scalar
    entropy = -torch.sum(probabilities * log_probabilities, dim=-1)
    return entropy
The difference is that the entropy should be calculated for each sample, not summed up over the whole batch.
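As a quick sanity check (the shapes and values here are illustrative, not from the repo):

import torch
import torch.nn.functional as F

output = torch.randn(4, 6)  # hypothetical logits: batch of 4, 6 categories
probabilities = F.softmax(output, dim=1)
log_probabilities = torch.log(probabilities)
entropy = -torch.sum(probabilities * log_probabilities, dim=-1)
print(entropy.shape)  # torch.Size([4]): one entropy per sample, as expected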
Oh! By the way, I think Equation (8) looks a little odd, as 'm' shows up on the left-hand side of the equation and also serves as the index of the 'max' operation.
I think you're right. When I run --dynamic, the results are even worse than with the fixed fusion method.
Hi @hubaak, yeah, it seems we may have a mistake in the entropy calculation; we will check it carefully and make the necessary correction.
In addition, thank you for your kind suggestion regarding Eq. 8. In this equation we want to use "m" to represent the modality; however, I think you are right that it looks a bit ambiguous. I will discuss this with my professor and make any essential correction in the camera-ready paper.
Hi @LittlePoolSpirit, I think decreasing the eval batch size may have some effect (e.g. --batch_size 1). For example, using the released checkpoint with fixed weights:
python main.py --ckpt_path best_model_of_dataset_CREMAD_Normal_alpha_0.3_optimizer_sgd_modulate_starts_0_ends_50_epoch_91_acc_0.7768817204301075.pth --gs_flag --dataset CREMAD --batch_size 1 --lorb base --av_alpha 0.55
We have: Accuracy: 0.7768817204301075, accuracy_a: 0.5981182795698925, accuracy_v: 0.668010752688172
--dynamic (with the modification by @hubaak):
python main.py --ckpt_path best_model_of_dataset_CREMAD_Normal_alpha_0.3_optimizer_sgd_modulate_starts_0_ends_50_epoch_91_acc_0.7768817204301075.pth --gs_flag --dataset CREMAD --batch_size 1 --lorb base --dynamic
We have: Accuracy: 0.7876344086021505, accuracy_a: 0.5981182795698925, accuracy_v: 0.668010752688172
Good idea. I ran into the same problem, too.
I just found that Eq. 8 can be simplified: it seems to be just a softmax over the negative entropies, as sketched below.
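If I am reading it correctly (notation mine; H_a and H_v denote the per-sample entropies of the audio and visual outputs), the fusion weight reduces to:

\lambda_a = \frac{e^{-H_a}}{e^{-H_a} + e^{-H_v}} = \mathrm{softmax}(-H)_a, \qquad \lambda_v = 1 - \lambda_a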
Yeah, they simply weight the outputs of the models by the softmax of their negative entropies.
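In code, that dynamic fusion would look roughly like this (the function and variable names are mine, not the repo's):

import torch
import torch.nn.functional as F

def dynamic_fusion(out_a, out_v):
    """Weight two modality outputs by the softmax of their negative entropies."""
    def per_sample_entropy(logits):
        p = F.softmax(logits, dim=1)
        return -torch.sum(p * torch.log(p), dim=-1)  # shape [B]

    h = torch.stack([per_sample_entropy(out_a), per_sample_entropy(out_v)], dim=1)  # [B, 2]
    w = F.softmax(-h, dim=1)  # [B, 2]; the lower-entropy (more confident) modality gets the larger weight
    return w[:, 0:1] * out_a + w[:, 1:2] * out_v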