Deep Learning SotA

Note: This repository is no longer maintained and is now archived. Please refer to sites such as Papers with Code, which provide more comprehensive and up-to-date information on SOTA models.

This repository lists the state-of-the-art results for mainstream deep learning tasks. We do our best to keep it up to date. If you find that a task's SotA result is outdated or missing, please raise an issue that includes the paper title, dataset, metric, source code, and year.

This summary is categorized into:

Computer Vision
Speech
NLP
Contact

Computer Vision

Classification

Dataset	Type	Top-1 accuracy	Method	Paper	Code
ImageNet	ResNet-50	78.35%	ResNet-50 + DropBlock + label smoothing	DropBlock: A Regularization Method for Convolutional Neural Networks
ImageNet	Single model	82.52%	AmoebaNet-B + DropBlock	DropBlock: A Regularization Method for Convolutional Neural Networks

Object Detection

Dataset	Type	AP	Method	Paper	Code
MS-COCO 2017	ResNet-101	43.4	D-RFCN + SNIP + ResNet-101	An Analysis of Scale Invariance in Object Detection - SNIP
MS-COCO 2017	Single model	45.7	D-RFCN + SNIP + DPN-98	An Analysis of Scale Invariance in Object Detection - SNIP

Instance Segmentation

Dataset	Type	AP	Method	Paper	Code
MS-COCO 2018	Ensemble	48.6	mmdet + FishNet, 5 models	-	PyTorch

Visual Question Answering

Dataset	Type	Score	Method	Paper	Code
VQA	Ensemble	72.41	Pythia	Pythia v0.1: The Winning Entry to the VQA Challenge 2018	PyTorch

Person Re-identification

Dataset	Type	Rank-1 accuracy	Method	Paper
Market-1501	Supervised single-query	91.2%	Pixel-level attention + region-level attention + joint feature learning	Harmonious Attention Network for Person Re-Identification
Market-1501	Supervised multi-query	93.8%	Pixel-level attention + region-level attention + joint feature learning + multi-query	Harmonious Attention Network for Person Re-Identification
DukeMTMC-reID	Supervised single-query	85.95%	SPReID	Human Semantic Parsing for Person Re-identification

NLP

Language Modelling

Dataset	Type	Perplexity	Method	Paper	Code
Penn Treebank		47.69	MoS	Breaking the Softmax Bottleneck: A High-Rank RNN Language Model	PyTorch
WikiText-2		40.68	MoS	Breaking the Softmax Bottleneck: A High-Rank RNN Language Model	PyTorch

Machine Translation

Dataset	Type	BLEU	Method	Paper	Code
WMT 2014 English-to-French		41.4	Weighted Transformer	Weighted Transformer Network for Machine Translation
WMT 2014 English-to-German		28.9	Weighted Transformer	Weighted Transformer Network for Machine Translation

Text Classification

Dataset	Type	Accuracy	Method	Paper	Code
Yelp		68.6%		Learning Structured Text Representations	TensorFlow

Natural Language Inference

Dataset	Type	Accuracy	Method	Paper
Stanford Natural Language Inference (SNLI)	Single	89.9%	GPT	Improving Language Understanding by Generative Pre-Training
Stanford Natural Language Inference (SNLI)	Ensemble	90.1%		Semantic Sentence Matching with Densely-Connected Recurrent and Co-Attentive Information
MultiNLI	Ensemble	86.7%		BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Question Answering

Dataset	Type	F1	Method	Paper	Code
SQuAD 2.0	Single model	83.061	BERT-large	BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Named Entity Recognition

Dataset	Type	F1	Method	Paper	Code
CoNLL-2003	Single model	92.8	BERT-large	BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Speech

Automatic Speech Recognition

Dataset	Type	WER	Method	Paper	Code
Switchboard Hub5'00	Ensemble	5.0	biLSTM + CNN + Dense, 8 models	The CAPIO 2017 Conversational Speech Recognition System

Contact

Email: [email protected]
