segmenter
Can you use Swin Transformer as the encoder?
As in the title: can you replace the standard ViT encoder with a Swin Transformer + FPN? Would this be a reasonable thing to try out?
You could probably replace ViT with a Swin encoder indeed :) It might improve performance, since Swin outperforms ViT on some downstream tasks.
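One thing to keep in mind if you try it: a ViT encoder gives a single token grid at one stride, while Swin gives a pyramid of feature maps, which is why an FPN (or some other fusion step) comes up. Here is a rough sketch of the shapes involved, in plain Python with no framework. The function names are made up for illustration and are not part of the segmenter code; the defaults assume ViT-B/16 (patch size 16, width 768) and Swin-T (patch size 4, embedding dim 96, 4 stages), and the FPN width of 256 is just a common choice.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FeatureMap:
    stride: int    # downsampling factor relative to the input image
    channels: int  # feature width at this level
    height: int
    width: int


def vit_features(image_size: Tuple[int, int], patch_size: int = 16,
                 dim: int = 768) -> List[FeatureMap]:
    """Hypothetical helper: a plain ViT yields one grid at a single stride."""
    h, w = image_size
    return [FeatureMap(patch_size, dim, h // patch_size, w // patch_size)]


def swin_features(image_size: Tuple[int, int], patch_size: int = 4,
                  embed_dim: int = 96, num_stages: int = 4) -> List[FeatureMap]:
    """Hypothetical helper: Swin halves resolution and doubles width per stage."""
    h, w = image_size
    feats = []
    for i in range(num_stages):
        stride = patch_size * 2 ** i
        feats.append(FeatureMap(stride, embed_dim * 2 ** i, h // stride, w // stride))
    return feats


def fpn_outputs(feats: List[FeatureMap], out_channels: int = 256) -> List[FeatureMap]:
    """Hypothetical helper: an FPN keeps each level's resolution but
    projects every level to the same channel width."""
    return [FeatureMap(f.stride, out_channels, f.height, f.width) for f in feats]


if __name__ == "__main__":
    size = (512, 512)
    for name, feats in [("ViT", vit_features(size)),
                        ("Swin", swin_features(size)),
                        ("Swin+FPN", fpn_outputs(swin_features(size)))]:
        print(name, [(f.stride, f.channels, f.height, f.width) for f in feats])
```

So with these assumptions the decoder would have to consume (or be fed a fused version of) four maps at strides 4, 8, 16 and 32 instead of one stride-16 token sequence.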