Junbum Cha

Results: 14 comments of Junbum Cha

You should use torchvision==0.2.1. I will update requirements.txt soon.

Thanks for pointing this out! I agree with you. I don't think it has a big impact on the final performance, but it is clearly a bug logically. I will...

The candidate operations are in models/ops.py.
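
For readers who haven't opened the repo, DARTS-style code usually registers its candidate operations as a name-to-constructor dictionary and mixes them with softmaxed architecture weights. The sketch below illustrates that pattern; the operation names and layers are illustrative, not the exact contents of models/ops.py.

```python
import torch
import torch.nn as nn

# Illustrative registry of candidate operations in the DARTS style:
# each entry maps an operation name to a constructor taking (channels, stride).
# Hedged sketch only -- not the actual models/ops.py.
OPS = {
    "skip_connect": lambda C, stride: nn.Identity(),
    "max_pool_3x3": lambda C, stride: nn.MaxPool2d(3, stride=stride, padding=1),
    "avg_pool_3x3": lambda C, stride: nn.AvgPool2d(3, stride=stride, padding=1),
    "conv_3x3": lambda C, stride: nn.Sequential(
        nn.ReLU(),
        nn.Conv2d(C, C, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(C),
    ),
}


class MixedOp(nn.Module):
    """Evaluate every candidate op and combine them with softmaxed
    architecture weights (the core DARTS relaxation). Assumes stride == 1
    so all candidates keep the same spatial shape."""

    def __init__(self, C: int, stride: int = 1):
        super().__init__()
        self.ops = nn.ModuleList([op(C, stride) for op in OPS.values()])

    def forward(self, x: torch.Tensor, alpha: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(alpha, dim=-1)
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

Usage would look like `MixedOp(C=16)(x, alpha)` with `alpha` a learnable vector of length `len(OPS)`.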

1. Web tutorial: http://khanrc.github.io/
2. GitHub: https://github.com/khanrc/pt.darts
3. Docker image: khanrc/pytorch-darts:0.1 - https://cloud.docker.com/u/khanrc/repository/docker/khanrc/pytorch-darts

The web link for the Docker image is wrong. Since the rules do not allow editing the issue, I'm leaving this comment for reference: https://hub.docker.com/r/khanrc/pytorch-darts

It is a learnable positional embedding, which is initialized randomly and learned during training. Please refer to the paper.
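
To make "initialized randomly and learned during training" concrete, here is a minimal sketch (names are illustrative, not the project's actual code): the positions are just a randomly initialized `nn.Parameter` added to the tokens and updated by the optimizer like any other weight.

```python
import torch
import torch.nn as nn


class TokensWithLearnablePosEmb(nn.Module):
    """Minimal sketch of a learnable positional embedding.
    Illustrative only -- not the repository's implementation."""

    def __init__(self, num_tokens: int, dim: int):
        super().__init__()
        # Random initialization; gradients flow into this parameter during training.
        self.pos_emb = nn.Parameter(torch.randn(1, num_tokens, dim) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_tokens, dim) -- add the learned positions element-wise.
        return x + self.pos_emb
```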

I'm not quite sure I follow your point. First, C-Abstractor inherently consists of both convolution and MLP layers. Therefore, the code in question simply implements the C-Abstractor itself, not C-Abstractor...
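
To make the "convolution plus MLP" point concrete, here is a hedged sketch of a C-Abstractor-style projector: convolution blocks to mix local features, adaptive pooling to reduce the number of visual tokens, then an MLP projection. It illustrates the idea only; layer sizes and names are assumptions, not the Honeybee implementation.

```python
import torch
import torch.nn as nn


class CAbstractorSketch(nn.Module):
    """Sketch of a C-Abstractor-style projector (illustrative, not the Honeybee code):
    conv blocks + adaptive pooling for token reduction + MLP projection."""

    def __init__(self, in_dim: int, out_dim: int, num_queries: int = 64):
        super().__init__()
        grid = int(num_queries ** 0.5)  # e.g. 64 queries -> 8x8 output grid
        self.conv = nn.Sequential(
            nn.Conv2d(in_dim, in_dim, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(in_dim, in_dim, 3, padding=1),
            nn.GELU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(grid)  # controls the output token count
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.GELU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, in_dim) from the vision encoder, assumed square grid.
        b, n, d = x.shape
        h = int(n ** 0.5)
        x = x.transpose(1, 2).reshape(b, d, h, h)  # to (B, C, H, W) for the convs
        x = self.pool(self.conv(x))                # conv mixing + token reduction
        x = x.flatten(2).transpose(1, 2)           # back to (B, num_queries, C)
        return self.mlp(x)                         # MLP projection to the LLM dim
```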

There are several points to address. **MLP or C-Abstractor is usually better than Resampler.** In most cases, the Resampler does not perform as well as the MLP or C-Abstractor. Comparing them on MMB...

First of all, I agree with @NormXU. I think that the Resampler alone and C-Abs + Resampler would result in similar outcomes. As you mentioned, the Resampler (trained with common web-crawled image-text...

Yes, and partially no. In our experiments, scaling up to 50M pre-training samples (= 200k steps) does not impact the relative performance between C-Abs and Resampler. However, if you have...