Donghyun Kim issues

Results 102 issues of Donghyun Kim

[98] ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks

[paper](https://arxiv.org/pdf/2102.11600.pdf) [code](https://github.com/SamsungLabs/ASAM) As expected, this kind of work belongs at ICML. Most sharpness studies suffer from a scale-dependency problem. #126 Metrics that address this problem have been proposed ([1](https://arxiv.org/pdf/1711.01530.pdf), [2](https://proceedings.neurips.cc/paper/2019/file/9edda0fd4d983bf975935cfd492fd50b-Paper.pdf), [3](https://proceedings.mlr.press/v119/tsuzuku20a.html), [4](https://arxiv.org/pdf/1903.02237.pdf)), but the papers that proposed these metrics directly...

ICML21

Sharpness

Sharpness Aware Minimization

[97] Sharpness-Aware Minimization for Efficiently Improving Generalization (SAM)

[paper](https://arxiv.org/abs/2010.01412) [code](https://github.com/google-research/sam) ## Probably Approximately Correct (PAC) [Video explanation by John Mount](https://www.youtube.com/watch?v=X4Oxst5huQA) [Sanghyuk Chun's blog](http://sanghyukchun.github.io/66/) I think I have read Sanghyuk Chun's PAC learning post about five times. Four years ago I read it without understanding a word, and every year the concept of PAC learning...

ICLR21

Sharpness

Sharpness Aware Minimization

[96] Sharp Minima Can Generalize For Deep Nets

A paper pointing out that the various ways of defining sharpness are flawed. [paper](https://arxiv.org/pdf/1703.04933.pdf) [Excellent video explanation - 딥러닝논문읽기모임 (Deep Learning Paper Reading Group)](https://www.youtube.com/watch?v=5E9SFe5WU1s) ## Definitions of flatness/sharpness Let us first look at a few ways of defining sharpness. ### volume $\epsilon$-flatness Hochreiter-sensei's...

ICML17

Sharpness

[95] CoCa: Contrastive Captioners are Image-Text Foundation Models

[paper](https://arxiv.org/pdf/2205.01917.pdf) The model that drew a lot of attention by setting a new ImageNet SOTA. A single round of pretraining lets it solve a variety of tasks. It uses a contrastive loss and a captioning loss at the same time. ![image](https://user-images.githubusercontent.com/16400591/175843886-8b0dd4aa-5191-44e0-8efe-eb1ac1863662.png) # Pretraining ![image](https://user-images.githubusercontent.com/16400591/175845821-9e769a4f-bb46-4901-8195-f58f69efbf42.png) ![image](https://user-images.githubusercontent.com/16400591/175845836-8e2ebdf3-e41d-4fba-912b-47d6feca22a8.png)...

Google

Pretraining

Vision-Language

[93] Grounded Language-Image Pre-training (GLIP)

[paper](https://arxiv.org/pdf/2112.03857.pdf) [code](https://github.com/microsoft/GLIP) Proposes a pretraining scheme that combines object detection and phrase grounding. The codebase is built on mask-rcnn to begin with. ![image](https://user-images.githubusercontent.com/16400591/175812784-c18c382e-990f-49dc-af85-b7bcac0b7a0f.png) The model's zero-shot results. # Prerequisites ## Dynamic Head The detection SOTA at the time...

Detection

Microsoft

Pretraining

CVPR22

Vision-Language

[94] Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks

[paper](https://arxiv.org/pdf/2206.08916.pdf) Trains a seq2seq-style model at large scale. There is a multi-task benchmark called GRIT, released by AllenAI, and as for work that can solve all of its tasks with a single model...

AllenAI

Pretraining

Vision-Language

[92] Revisiting Multi-Scale Feature Fusion for Semantic Segmentation

[paper](https://arxiv.org/abs/2203.12683) Really simple. For segmentation, it helps to use feature levels up to P9. There is no need for a module like ASPP (it is costly). That's it. ![image](https://user-images.githubusercontent.com/16400591/164726195-74bad6f9-6c6c-4452-9efa-138d64004871.png) Performance comparison. ![image](https://user-images.githubusercontent.com/16400591/164726059-3e2847e1-70b1-47a2-8aa5-3aea581e0e92.png) Clearly, for a task like semantic segmentation...

Google

[91] Three things everyone should know about Vision Transformers

[paper](https://arxiv.org/abs/2203.09795) Touvron's latest work. 1. Connect the modules in parallel. 2. When data is scarce, fine-tuning only the MHSA is good enough. 3. If the 16 patchify layer is split into layers with small strides, as in resnet-d...

Meta AI

[90] Exploring Plain Vision Transformer Backbones for Object Detection

[paper](https://arxiv.org/abs/2203.16527) For object detection, ViT does not need to be hierarchical. Staying non-hierarchical also makes it easier to use pretrained models. In the later part of the paper, pretraining the ViT with MAE...

FAIR

[89] Sparse Instance Activation for Real-Time Instance Segmentation (SparseInst)

[paper](https://arxiv.org/abs/2203.12827) [code](https://github.com/hustvl/SparseInst) A paper that set a new SOTA in real-time instance segmentation. ![image](https://user-images.githubusercontent.com/16400591/164712310-55efe6c0-246f-421e-a18d-57258d95fb4f.png) # Intro Various instance segmentation approaches, such as region-based methods and dynamic convolution, are hard to get working well once you move into the real-time regime....

Instance Segmentation

CVPR22