Adaptive-Attention
Some unexplained parameters in your code
Excellent work, but the code is a bit messy and not friendly to a novice like me. Could you explain the meaning of some key parameters in the code, such as "-d, double", "use_perm", and "use_inter_class"? What is the difference between "weighted" and "unweighted"? There are many details in the code that are not covered in the paper :)
Looking forward to your reply!
Sure, this work was done a very long time ago, and I'll take some time to clean up the code in the future. Some ablation experiments are not included in our paper. For your questions:
- For "-d double", we ablate to use two reweighting module, one for generating the spatial attention map and one for classification.
- For "use_permute" as indicated in paper Section 3.3, we use the symmetric form of the function, which means we also "permute" the role of query and reference and compute a set of new scores. Then two scores are merged together for the final prediction. You can regard it as kind of model ensemble.
- For "inter_class_loss", you can refer to https://github.com/zihangJiang/Adaptive-Attention/blob/45eeb8fd629a81eebb3c8a8b869551f4f8738325/src/cfr_loss.py#L147-L148 which is the loss with regard to the reweighting weight. The aim is to force the Meta-weight of the same class also to be similar.
- "weighted" means to use our reweighting strategy as explained in Section 3.2 in our paper.
Hope this answers your questions.