
State-of-the-art results for deep learning tasks in various fields.

Deep Learning SotA

Note: This repository is no longer maintained and is now archived. Please refer to websites such as Papers with Code, which provide more comprehensive and up-to-date information on SotA models.

This repository lists the state-of-the-art results for mainstream deep learning tasks. We do our best to keep it up to date. If you find that a task's SotA result is outdated or missing, please raise an issue that includes the paper title, dataset, metric, source code, and year.

This summary is categorized into:

  • Computer Vision
  • NLP
  • Speech
  • Contact

Computer Vision

Classification

| Dataset | Type | Top-1 accuracy | Method | Paper | Code |
|---|---|---|---|---|---|
| ImageNet | ResNet-50 | 78.35% | ResNet-50 + DropBlock + label smoothing | DropBlock: A Regularization Method for Convolutional Neural Networks | - |
| ImageNet | Single model | 82.52% | AmoebaNet-B + DropBlock | DropBlock: A Regularization Method for Convolutional Neural Networks | - |
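
For readers unfamiliar with the metric in the table above, top-1 accuracy is the fraction of samples whose highest-scoring class matches the ground-truth label. The following is a minimal sketch in plain Python; the function name `top1_accuracy` and its argument layout are assumptions made for illustration and are not taken from any of the listed papers or their evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the fraction of samples whose arg-max class equals the label.

    `scores[i][c]` is the (hypothetical) model score for class `c` on sample `i`;
    `labels[i]` is the ground-truth class index of sample `i`.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)
```

A value of 0.7835 from such a function would correspond to the 78.35% reported in the first row, assuming the standard definition is what the paper used.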

Object Detection

| Dataset | Type | AP | Method | Paper | Code |
|---|---|---|---|---|---|
| MS-COCO 2017 | ResNet-101 | 43.4 | D-RFCN + SNIP + ResNet-101 | An Analysis of Scale Invariance in Object Detection - SNIP | - |
| MS-COCO 2017 | Single model | 45.7 | D-RFCN + SNIP + DPN-98 | An Analysis of Scale Invariance in Object Detection - SNIP | - |

Instance Segmentation

| Dataset | Type | AP | Method | Paper | Code |
|---|---|---|---|---|---|
| MS-COCO 2018 | Ensemble | 48.6 | mmdet + FishNet, 5 models | - | PyTorch |

Visual Question Answering

| Dataset | Type | Score | Method | Paper | Code |
|---|---|---|---|---|---|
| VQA | Ensemble | 72.41 | Pythia | Pythia v0.1: The Winning Entry to the VQA Challenge 2018 | PyTorch |

Person Re-identification

| Dataset | Type | Rank-1 accuracy | Method | Paper | Code |
|---|---|---|---|---|---|
| Market-1501 | Supervised single-query | 91.2% | Pixel-level attention + region-level attention + joint feature learning | Harmonious Attention Network for Person Re-Identification | - |
| Market-1501 | Supervised multi-query | 93.8% | Pixel-level attention + region-level attention + joint feature learning + multi-query | Harmonious Attention Network for Person Re-Identification | - |
| DukeMTMC-reID | Supervised single-query | 85.95% | SPReID | Human Semantic Parsing for Person Re-identification | - |

NLP

Language Modelling

| Dataset | Type | Perplexity | Method | Paper | Code |
|---|---|---|---|---|---|
| Penn Treebank | - | 47.69 | MoS | Breaking the Softmax Bottleneck: A High-Rank RNN Language Model | PyTorch |
| WikiText-2 | - | 40.68 | MoS | Breaking the Softmax Bottleneck: A High-Rank RNN Language Model | PyTorch |
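
As a reminder of what the perplexity column measures, perplexity is the exponential of the average negative log-likelihood per token, so lower is better. Below is a minimal sketch under the assumption that per-token natural-log probabilities are already available; the function name `perplexity` is hypothetical and this is not the evaluation script of the listed paper.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp(mean negative log-probability) over the given tokens.

    `token_log_probs[i]` is the natural-log probability the model assigned
    to the i-th token of the evaluation text (an assumed input format).
    """
    if not token_log_probs:
        raise ValueError("at least one token is required")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)
```

Under this definition, a model that assigns probability 0.25 to every token has a perplexity of 4, which is a quick sanity check for any implementation.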

Machine Translation

| Dataset | Type | BLEU | Method | Paper | Code |
|---|---|---|---|---|---|
| WMT 2014 English-to-French | - | 41.4 | Weighted Transformer | Weighted Transformer Network for Machine Translation | - |
| WMT 2014 English-to-German | - | 28.9 | Weighted Transformer | Weighted Transformer Network for Machine Translation | - |

Text Classification

| Dataset | Type | Accuracy | Method | Paper | Code |
|---|---|---|---|---|---|
| Yelp | - | 68.6% | - | Learning Structured Text Representations | TensorFlow |

Natural Language Inference

| Dataset | Type | Accuracy | Method | Paper | Code |
|---|---|---|---|---|---|
| Stanford Natural Language Inference (SNLI) | Single | 89.9% | GPT | Improving Language Understanding by Generative Pre-Training | - |
| Stanford Natural Language Inference (SNLI) | Ensemble | 90.1% | - | Semantic Sentence Matching with Densely-Connected Recurrent and Co-Attentive Information | - |
| MultiNLI | Ensemble | 86.7% | - | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | - |

Question Answering

| Dataset | Type | F1 | Method | Paper | Code |
|---|---|---|---|---|---|
| SQuAD 2.0 | Single model | 83.061 | BERT-large | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | - |

Named Entity Recognition

| Dataset | Type | F1 | Method | Paper | Code |
|---|---|---|---|---|---|
| CoNLL-2003 | Single model | 92.8 | BERT-large | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | - |

Speech

Automatic Speech Recognition

| Dataset | Type | WER (%) | Method | Paper | Code |
|---|---|---|---|---|---|
| Switchboard Hub5'00 | Ensemble | 5.0 | biLSTM + CNN + Dense, 8 models | The CAPIO 2017 Conversational Speech Recognition System | - |
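
For context on the WER column, word error rate is commonly defined as the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below illustrates that definition in plain Python; the function name `word_error_rate` is hypothetical, and real benchmark scoring typically applies text normalization rules that are not modeled here.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Return word-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i] + [0] * len(hypothesis)
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(reference)
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the figure in the table, assuming the same definition.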

Contact

Email: [email protected]