
AutoEncoder using the EfficientNet

Open xingyaoww opened this issue 4 years ago • 5 comments

The AutoEncoder is implemented by reversing the forward EfficientNet to form the decoder. The current implementation uses only Dynamic Padding for TransposedConv2d, which works fine for me for now.

xingyaoww avatar Dec 20 '20 15:12 xingyaoww

Thanks for this PR! Very interesting. I'll have to think about whether this should be integrated into the main repo or whether it should be a standalone repo. Either way, we'll make sure the community can benefit from this good work!

I might be a bit slow to respond over the next week or two due to the holidays, so do not fret if that is the case.

lukemelas avatar Dec 23 '20 05:12 lukemelas

Thank you for your reply!

I just updated my AE implementation to use TransposedConv2dStaticSamePadding, since the original version did not take odd image sizes into account. For example, when Conv2d changes the image size from (29,29) to (15,15), the reverse TransposedConv2d operation should convert (15,15) back into (29,29) rather than (30,30).

The old implementation using TransposedConv2dDynamicSamePadding converts the image size to (30,30), which causes an output shape mismatch. DynamicSamePadding seems to work only for EfficientNet models with even image sizes (it works for efficientnet-b0 but not efficientnet-b5), so I am also removing TransposedConv2dDynamicSamePadding in recent commits.
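The (29,29) to (15,15) example can be checked with plain size arithmetic. The sketch below is not the code from this PR: the function names are hypothetical, and it assumes a 3x3 kernel with stride 2 and dilation 1. It uses the standard transposed-convolution output-size formula and shows why the decoder needs to know the target size in advance: both 29 and 30 shrink to 15 in the forward pass, so the size cannot be recovered from 15 alone.

```python
import math


def same_conv_out(size: int, stride: int) -> int:
    """Output length of a 'same'-padded convolution: ceil(size / stride)."""
    return math.ceil(size / stride)


def conv_transpose_out(size: int, kernel: int, stride: int,
                       padding: int = 0, output_padding: int = 0) -> int:
    """Output length of a transposed convolution with dilation 1."""
    return (size - 1) * stride - 2 * padding + (kernel - 1) + output_padding + 1


def static_transpose_args(in_size: int, target: int, kernel: int, stride: int):
    """Hypothetical helper: choose (padding, output_padding) so that the
    transposed convolution maps in_size to exactly target."""
    raw = conv_transpose_out(in_size, kernel, stride)  # size with no padding
    excess = raw - target                              # amount to trim
    if excess < 0:
        raise ValueError("target is larger than the unpadded output")
    padding = math.ceil(excess / 2)          # trims 2 * padding in total
    output_padding = 2 * padding - excess    # gives back 0 or 1 if over-trimmed
    return padding, output_padding


# Forward pass: an odd and an even size collapse to the same value.
odd_down = same_conv_out(29, stride=2)
even_down = same_conv_out(30, stride=2)

# A size-agnostic reverse can only guess in_size * stride.
naive_up = odd_down * 2

# A reverse that knows the target size can hit it exactly.
pad, out_pad = static_transpose_args(odd_down, target=29, kernel=3, stride=2)
exact_up = conv_transpose_out(odd_down, 3, 2, pad, out_pad)

print(odd_down, even_down, naive_up, exact_up)
```

With these assumptions, the odd target 29 is reached with padding 1 and output_padding 0, while the even target 30 needs padding 1 and output_padding 1. A fixed choice made without knowing the target cannot satisfy both.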

xingyaoww avatar Dec 28 '20 15:12 xingyaoww

Hello. Will this be merged?

AFAgarap avatar Nov 21 '21 13:11 AFAgarap

Great pull request! I am trying EfficientNetAutoEncoder.from_pretrained() and wondering whether the shapes below are correct. I ask because I have just learned that an autoencoder is an unsupervised model, so its output shape should match its input shape. The autoencoder output shape differs across efficientnet-b0 to efficientnet-b7, as shown below. Could you tell me whether this is expected or a bug?

0: input/(512,512) -> ae_output/(512,512)
1: input/(512,512) -> ae_output/(496,496)
2: input/(512,512) -> ae_output/(484,484)
3: input/(512,512) -> ae_output/(492,492)
4: input/(512,512) -> ae_output/(508,508)
5: input/(512,512) -> ae_output/(488,488)
6: input/(512,512) -> ae_output/(496,496)
7: input/(512,512) -> ae_output/(504,504)

(I'm looking into the code, but it's difficult ;) Thanks in advance for any help.)

leejonggun avatar Dec 30 '21 11:12 leejonggun

Also looking forward to this PR being merged 👍

cwerner avatar Jun 04 '22 09:06 cwerner