VSS-CFFM: Do test results on several images, rather than whole videos, represent the performance of video semantic segmentation methods?

Open: imzhangyd opened this issue 1 year ago • 1 comment

Here is a problem I'm confused about. The task of video semantic segmentation is to segment every frame of a video. But only a few frames are labeled in the test set, so the test performance reported in the experiments is measured on several images rather than on whole videos. I don't think that can represent the performance of video semantic segmentation methods. Did I misunderstand something here?

Oct 09 '22 07:10 imzhangyd

Hi, thanks for your interest. For the VSPW dataset, the test performance is measured on whole videos rather than on several images. For Cityscapes, it is true that the test performance is measured on images rather than on whole videos. Your concern is reasonable. This is why we conduct most of our experiments on VSPW rather than on Cityscapes. Previously, there was no fully annotated dataset for video semantic segmentation, so researchers used Cityscapes for their experiments.
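To make the difference concrete, here is a minimal sketch of how mIoU behaves when only some frames carry ground truth. It is an illustration only, not the evaluation code of VSS-CFFM or of either benchmark: the function names `confusion_matrix` and `video_miou`, the `ignore_index=255` convention, and the toy 4-frame video are all assumptions made for this example.

```python
import numpy as np

def confusion_matrix(pred, label, num_classes, ignore_index=255):
    """Confusion matrix (rows: ground truth, columns: prediction) for one frame."""
    valid = label != ignore_index
    idx = num_classes * label[valid].astype(np.int64) + pred[valid].astype(np.int64)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def video_miou(preds, labels, num_classes):
    """mIoU over the frames that have ground truth.

    labels[t] is None for an unannotated frame; such frames are skipped,
    so the predictions made on them cannot influence the score.
    """
    total = np.zeros((num_classes, num_classes), dtype=np.int64)
    n_scored = 0
    for pred, label in zip(preds, labels):
        if label is None:
            continue
        total += confusion_matrix(pred, label, num_classes)
        n_scored += 1
    inter = np.diag(total)
    union = total.sum(axis=0) + total.sum(axis=1) - inter
    present = union > 0  # average only over classes that actually occur
    return float((inter[present] / union[present]).mean()), n_scored

# Hypothetical toy video: 4 frames of 2x2 pixels, 2 classes.
gt = np.array([[0, 0], [1, 1]])
good = gt.copy()                   # perfect prediction
bad = np.array([[1, 1], [1, 1]])   # predicts class 1 everywhere
preds = [good, bad, bad, bad]

dense_labels = [gt, gt, gt, gt]          # every frame annotated
sparse_labels = [gt, None, None, None]   # only one frame annotated

sparse_miou, sparse_n = video_miou(preds, sparse_labels, num_classes=2)
dense_miou, dense_n = video_miou(preds, dense_labels, num_classes=2)
print(f"sparse: mIoU={sparse_miou:.3f} over {sparse_n} frame(s)")
print(f"dense:  mIoU={dense_miou:.3f} over {dense_n} frame(s)")
```

In this toy case the sparse protocol scores only the single annotated frame, so the errors on the other three frames are invisible to it, while the dense protocol counts them.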

Oct 10 '22 08:10 GuoleiSun
