



awesome-pretrained-models-for-information-retrieval

A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a. pre-training for IR). If any papers are missing, feel free to open a PR to include them. Feedback and contributions are welcome!

Pre-training for IR

  • Survey Papers

  • Phase 1: First-stage Retrieval

    • Sparse Retrieval
      • Neural term re-weighting
      • Query or document expansion
      • Sparse representation learning
    • Dense Retrieval
      • Hard negative sampling
      • Late interaction and multi-vector representation
      • Knowledge distillation
      • Jointly learning retrieval and indexing
      • Domain adaptation
      • Pre-training tailored for dense retrieval
    • Combining Sparse Retrieval and Dense Retrieval
  • Phase 2: Re-ranking Stage

    • Basic Usage
      • Discriminative ranking models
      • Generative ranking models
      • Hybrid ranking models
    • Long Document Processing Techniques
      • Passage score aggregation
      • Passage representation aggregation
      • Designing new architectures
    • Improving Efficiency
      • Decoupling the interaction
      • Knowledge distillation
      • Early exit
    • Re-weighting Training Samples
    • Query Expansion
    • Partial Fine-tuning
    • Pre-training Tailored for Re-ranking
    • Cross-lingual Retrieval
  • Jointly Learning to Retrieve and Re-rank

  • Model-based IR System

  • Multimodal Retrieval

    • Unified Single-stream Architecture
    • Multi-stream Architecture Applied on Input
  • Other Resources

Survey Papers

First Stage Retrieval

Sparse Retrieval

Neural term re-weighting

Query or document expansion

Sparse representation learning

Dense Retrieval

Hard negative sampling

Late interaction and multi-vector representation

Knowledge distillation

Jointly learning retrieval and indexing

Domain adaptation

Pre-training tailored for dense retrieval

Combining Sparse Retrieval and Dense Retrieval

Re-ranking Stage

Basic Usage

Discriminative ranking models

Representation-focused
Interaction-focused

Generative ranking models

Hybrid ranking models

Long Document Processing Techniques

Passage score aggregation

Passage representation aggregation

Designing new architectures

Improving Efficiency

Decoupling the interaction

Knowledge distillation

Early exit

Re-weighting Training Samples

Query Expansion

Partial Fine-tuning

Pre-training Tailored for Re-ranking

Cross-lingual Retrieval

Jointly Learning to Retrieve and Re-rank

Model-based IR System

Multimodal Retrieval

Unified Single-stream Architecture

Multi-stream Architecture Applied on Input

Other Resources

Some Retrieval Toolkits

Other Resources About Pre-trained Models in NLP

Surveys About Efficient Transformers