
Question about resamplers in V2

Open LinZhuoChen opened this issue 5 months ago • 2 comments

Thank you for your awesome work!

Which type of resampler is used in MoGe-2?

['pixel_shuffle', 'nearest', 'bilinear', 'conv_transpose', 'pixel_unshuffle', 'avg_pool', 'max_pool']

Looking forward to your reply.

LinZhuoChen avatar Jul 08 '25 11:07 LinZhuoChen

You can find this by loading their pretrained checkpoint.

pkqbajng avatar Jul 09 '25 06:07 pkqbajng

We use conv_transpose for the first three upsamplers and bilinear for the last one, as specified in the model configuration from the pretrained checkpoint:

"resamplers": ["conv_transpose", "conv_transpose", "conv_transpose", "bilinear"]

Note that these choices are not critical to the model’s performance. We experimented with different upsampling methods and tried searching for the optimal combination, but the results showed only minor differences in accuracy and convergence. The current configuration was therefore chosen mainly for its simplicity and its consistency with MoGe-1, to reduce confusion.
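For anyone who wants to check this themselves, here is a minimal sketch of pulling the `resamplers` entry out of a model configuration once it is available as a Python dict. The nesting under `"neck"` and the helper name `find_resamplers` are assumptions made for illustration; only the `resamplers` list itself and the option names come from this thread, and the actual layout of the checkpoint's configuration may differ.

```python
import json

# Resampler names listed in the question above.
KNOWN_RESAMPLERS = {
    "pixel_shuffle", "nearest", "bilinear", "conv_transpose",
    "pixel_unshuffle", "avg_pool", "max_pool",
}


def find_resamplers(config):
    """Recursively collect every "resamplers" entry in a nested dict/list."""
    found = []
    if isinstance(config, dict):
        for key, value in config.items():
            if key == "resamplers":
                found.append(value)
            else:
                found.extend(find_resamplers(value))
    elif isinstance(config, list):
        for item in config:
            found.extend(find_resamplers(item))
    return found


# Hypothetical config layout: only the "resamplers" list is taken from the thread.
example_config = json.loads(
    '{"neck": {"resamplers": '
    '["conv_transpose", "conv_transpose", "conv_transpose", "bilinear"]}}'
)

for resamplers in find_resamplers(example_config):
    # Flag any name that is not one of the options listed in the question.
    unknown = [name for name in resamplers if name not in KNOWN_RESAMPLERS]
    print("resamplers:", resamplers, "| unknown names:", unknown)
```

The recursive search is used because the exact position of the key inside the configuration is not stated here, so the sketch does not depend on any particular nesting.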

EasternJournalist avatar Jul 09 '25 07:07 EasternJournalist