SCAN
Why does SCAN upscale the image by 4x, and why does it only work well on low-resolution images?
I trained the model with batchsize=1, but at inference time only images of a few tens of KB produce results. Larger images, for example a 2 MB image, fail with an out-of-memory error. Why is this?
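For context, here is a rough back-of-the-envelope sketch of how tensor size grows with resolution at 4x upscaling. It is not taken from the SCAN code: the function name `sr_tensor_bytes`, the 3-channel float32 layout, and the example resolutions are all assumptions for illustration, and it ignores the intermediate feature maps, which usually take far more memory than the input and output.

```python
def sr_tensor_bytes(height, width, scale=4, channels=3, bytes_per_value=4):
    """Return (input_bytes, output_bytes) for one image stored as a dense tensor.

    Assumes 3 channels and float32 values (hypothetical defaults, not SCAN's
    actual configuration). Intermediate activations are not counted.
    """
    input_bytes = height * width * channels * bytes_per_value
    output_bytes = (height * scale) * (width * scale) * channels * bytes_per_value
    return input_bytes, output_bytes


# Hypothetical example sizes: a small patch and a full HD frame.
small_in, small_out = sr_tensor_bytes(128, 128)
large_in, large_out = sr_tensor_bytes(1080, 1920)

print(f"128x128:   input {small_in / 2**20:.2f} MiB, output {small_out / 2**20:.2f} MiB")
print(f"1080x1920: input {large_in / 2**20:.2f} MiB, output {large_out / 2**20:.2f} MiB")
```

The output always has 16 times as many pixels as the input at 4x scale, and the decoded pixel count rather than the batch size or the compressed file size is what drives memory use here, so batchsize=1 does not by itself bound inference memory.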