Donghyun Kim issues

Results: 102 issues of Donghyun Kim

[78] Understanding Failure Modes of Self-Supervised Learning

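from `weekly arxiv AI 만담` ``` Proposes the Q-score, a metric for evaluating images that an SSL-trained model misclassifies. Correctly classified images have sparse, distinctive features; the features of misclassified images are the opposite. The encoder...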

Self-Supervised

Meta AI

[79] Visualizing the Loss Landscape of Neural Nets

[paper](https://arxiv.org/pdf/1712.09913.pdf) [code](https://github.com/tomgoldstein/loss-landscape) ![image](https://user-images.githubusercontent.com/16400591/158041025-025dc500-0488-41e4-9b9f-6367ebd4a27a.png) ## Contribution - Exposes the shortcomings of various loss-function visualization methods and shows that simple visualization methods fail to accurately capture the local geometry (sharpness or flatness) of loss-function minimizers. - “filter...

NeurIPS18

[77] DeepNet: Scaling Transformers to 1,000 Layers

[paper](https://arxiv.org/abs/2203.00555) [code](https://github.com/microsoft/unilm) `rosinality`'s comment ``` The stability of pre-norm combined with the performance of post-norm. With post-norm, large updates occur early in training, and these updates interact with layer norm to cause vanishing gradients. Training...

Microsoft

[69] How do vision transformers work? (AlterNet)

A paper studying why ViTs work well. [paper](https://arxiv.org/abs/2202.06709) Commonly assumed reasons why MSA is good ``` Which property of MSA benefits the model? ==> long range dependency MSA conv...

BackBone

ICLR22

[76] Visual Attention Network (VAN)

[paper](https://arxiv.org/pdf/2202.09741.pdf) [code](https://github.com/Visual-Attention-Network) ![image](https://user-images.githubusercontent.com/16400591/155637436-bc3c8b0a-4582-49e4-b36d-13157158694a.png) Proposes large kernel attention (LKA) and achieves SOTA. ## Large Kernel Attention (LKA) A very simple idea. The figure below shows a 13x13 conv split into parts. Yellow marks the center point. The kernel...

BackBone

[75] Vision-Language Pre-Training with Triple Contrastive Learning

[rosinality](https://github.com/rosinality)'s comment ``` The idea: vision-language models have explored alignment across different modalities extensively, but representation learning within each individual modality has arguably been missing. So, building on cross-modal alignment + intra-modal alignment...

Amazon

Pretraining

MultiTask

MultiModal

[74] Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations (EViT)

[paper](https://arxiv.org/pdf/2202.07800.pdf) [code](https://github.com/youweiliang/evit) Make attention more efficient (use only what is needed!). ![image](https://user-images.githubusercontent.com/16400591/155251184-5e5ae419-1ddc-4fc3-a5bd-143055447e40.png) # Token Reorganization A method that identifies image tokens (background or object) and fuses them. ![image](https://user-images.githubusercontent.com/16400591/155248002-4e72c343-ea5a-401b-bbd1-8679825902b8.png) ## Attentive Token Identification...

Tencent

Light Attention

Efficient

ICLR22
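The token reorganization idea in [74] (identify image tokens, then fuse) can be sketched roughly as follows. This is a minimal NumPy illustration, not the code from the linked repository: it assumes tokens are ranked by the attention they receive from the class token, the top fraction is kept, and the rest are merged into one token by an attention-weighted average. The function name `reorganize_tokens` and the `keep_rate` parameter are hypothetical.

```python
import numpy as np

def reorganize_tokens(tokens, cls_attn, keep_rate=0.7):
    """Keep the most attended image tokens and fuse the rest into one token.

    tokens:   (n, d) array of image token embeddings (class token excluded).
    cls_attn: (n,) positive attention weights from the class token (assumed input).
    Returns the reorganized tokens and the indices of the kept tokens.
    """
    n = tokens.shape[0]
    k = max(1, int(np.ceil(keep_rate * n)))

    # Rank tokens from most to least attended.
    order = np.argsort(-cls_attn, kind="stable")
    keep_idx = np.sort(order[:k])   # kept tokens stay in original order (a choice of this sketch)
    drop_idx = order[k:]

    kept = tokens[keep_idx]
    if drop_idx.size == 0:
        return kept, keep_idx

    # Fuse the less attended tokens with an attention-weighted average.
    w = cls_attn[drop_idx]
    fused = (w[:, None] * tokens[drop_idx]).sum(axis=0) / w.sum()
    return np.vstack([kept, fused[None, :]]), keep_idx
```

With 4 tokens and `keep_rate=0.5`, the two most attended tokens are kept and the other two collapse into a single extra token, so the sequence length drops from 4 to 3.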

[73] Why Do Better Loss Functions Lead to Less Transferable Features?

Good ImageNet performance does not imply good transferability. Let's analyze how transferability changes across 9 losses. [paper](https://arxiv.org/pdf/2010.16402.pdf) [sungchul.kim review KR (notion)](https://www.notion.so/Why-Do-Better-Loss-Functions-Lead-to-Less-Transferable-Features-75d77d79294241c1b3f043740b726858) # INTRO 1. Regarding the objective, how much a network layer's representation and the output...

Google

Pretraining

NeurIPS21

XAI

[71] SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability

[paper](https://arxiv.org/pdf/1706.05806.pdf) [Jay Alammar youtube](https://www.youtube.com/watch?v=u7Dvb_a1D-0) Measure the similarity between layers! This lets us see whether the network is overparameterized, how training progresses, and so on. The "SV" (Singular Vector) in the name comes from applying a singular value decomposition....

Google

NeurIPS17

XAI
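The layer-similarity measurement in [71] can be illustrated with a small NumPy sketch. It assumes each layer's activations are given as a (samples x neurons) matrix, reduces each with an SVD, and then averages the canonical correlations between the two reduced subspaces. The function name `svcca_similarity` and the `var_kept` threshold are illustrative choices, not the paper's released code.

```python
import numpy as np

def svcca_similarity(acts_x, acts_y, var_kept=0.99):
    """Mean canonical correlation between two layers' activations.

    acts_x, acts_y: arrays of shape (n_samples, n_neurons), same n_samples.
    var_kept: fraction of variance retained in the SVD step (assumed default).
    """
    def top_directions(acts):
        # Center each neuron, then keep the leading left singular vectors.
        centered = acts - acts.mean(axis=0, keepdims=True)
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        energy = np.cumsum(s ** 2) / np.sum(s ** 2)
        k = min(int(np.searchsorted(energy, var_kept)) + 1, len(s))
        return u[:, :k]  # orthonormal basis of the retained subspace

    ux = top_directions(acts_x)
    uy = top_directions(acts_y)

    # For orthonormal bases, the singular values of ux^T uy are the
    # canonical correlations between the two subspaces.
    rho = np.linalg.svd(ux.T @ uy, compute_uv=False)
    return float(np.mean(np.clip(rho, 0.0, 1.0)))
```

A value near 1 means the two layers span nearly the same directions over the dataset; unrelated random activations give a much lower value.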

[70] Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision (SEER)

[paper](https://arxiv.org/pdf/2202.08360.pdf) A paper about pouring data into a 10B dense model. The SSL method is SwAV. ## Data Non-EU data (because of GDPR, the EU privacy regulation), scraped from Instagram. The data is unfiltered, 1B in total. The analysis...

Pretraining

Meta AI