
[AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer

Results 6 AVSegFormer issues

File "/home/hwh/Project/AVS/AVSegFormer-master/model/head/AVSegHead.py", line 238, in forward
    mask_feature = self.fusion_block(mask_feature, audio_feat)
File "/home/hwh/anaconda3/envs/AVS39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
File "/home/hwh/Project/AVS/AVSegFormer-master/model/utils/fusion_block.py", line 44, in forward
    fusion_map = torch.einsum('bchw,bc->bchw', feature_map, x.squeeze())...
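The traceback above is cut off before the error message, so the actual cause is not stated. One common way a line like this can fail, sketched here under the assumption that `x` has shape (B, 1, C), is that `squeeze()` with no argument also removes the batch dimension when B is 1, leaving a 1-D tensor that no longer matches the `bc` subscript. The `fuse` helper and the tensor sizes below are hypothetical and only mirror the einsum call shown in the traceback; they are not the repository's code.

```python
import torch


def fuse(feature_map, x, squeeze_dim=None):
    """Hypothetical stand-in for the einsum line shown in the traceback."""
    # squeeze() with no argument drops every size-1 dimension;
    # squeeze(dim) drops only the requested one.
    audio = x.squeeze() if squeeze_dim is None else x.squeeze(squeeze_dim)
    return torch.einsum('bchw,bc->bchw', feature_map, audio)


# Assumed shapes: batch size 1, 4 channels, 2x2 spatial map.
feature_map = torch.randn(1, 4, 2, 2)
x = torch.randn(1, 1, 4)  # assumed (B, 1, C) with B = 1

try:
    fuse(feature_map, x)  # x.squeeze() -> shape (4,), 'bc' expects 2 dims
    failed = False
except RuntimeError:
    failed = True

# Squeezing only dim 1 keeps the batch dimension: shape (1, 4).
out = fuse(feature_map, x, squeeze_dim=1)
```

If this is the cause, the failure would appear only with a batch size of 1 (for example, the last incomplete batch of an epoch), which is worth checking against the full error message.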

Hello, I trained this model on the S4 dataset with image size (224, 224), but the result I could reproduce is only 0.734. I did not modify the configuration file.

When training with the AVSS dataset, the audio_fea extracted by VGGish has bs * 10 as its first dimension, which does not match the subsequent feature matrix with bs in...

When training the model on the AVSS dataset, we find that the mIoU is about 20 with the Res50 backbone and about 30 with the PVT-v2 backbone at 11 epochs. Could...

Hey, thanks for your wonderful work. You mentioned that the GPU used is a V100. I'm wondering if I can reproduce your work on a few 2080 Ti cards?