Paper List for In-context Learning 🌷

Contents

  • Paper List for In-context Learning
    • Introduction
      • Keywords Convention
    • Papers
      • Model Warmup for ICL
      • Prompt Tuning for ICL
        • Prompt Selection Strategies for LLMs
        • Prompt Formulation Strategies for LLMs
      • Analysis of ICL
        • Influence Factors for ICL
        • Working Mechanism of ICL
      • Evaluation and Resources
      • Application
      • Problems
      • Challenges and Future Directions
      • How to contribute?

Introduction

This paper list on in-context learning (ICL) accompanies the following survey:

A Survey for In-context Learning,
Qingxiu Dong, Lei Li, Damai Dai, Ce Zheng, Zhiyong Wu, Baobao Chang, Xu Sun, Jingjing Xu, Lei Li, Zhifang Sui
arXiv preprint (arXiv 2301.00234)

Keywords Convention

  • abbreviation
  • section in our survey
  • main feature
  • conference

Papers

Model Warmup for ICL

This section contains pilot works that might contribute to the warmup strategies of ICL.

  1. MetaICL: Learning to Learn In Context. NAACL 2022. A pretrained language model is tuned to do in-context learning on a large set of training tasks.

    Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi. [pdf], [project], 2021.10,

  2. Improving In-Context Few-Shot Learning via Self-Supervised Training.

    Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva. [pdf], [project], 2022.5,

  3. Calibrate Before Use: Improving Few-shot Performance of Language Models.

    Zihao Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh. [pdf], [project], 2021.2,

    • Using N/A string to calibrate LMs away from common token bias
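The calibration idea noted in the last entry can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the model's label probabilities for the real prompt and for a content-free prompt (the test input replaced by "N/A") are already available as plain lists. The function name `calibrate`, the example numbers, and the choice to renormalize by simple division are all assumptions of this sketch.

```python
def calibrate(label_probs, content_free_probs):
    """Rescale label probabilities by those obtained for a content-free input.

    label_probs: model probabilities over the labels for the real test input.
    content_free_probs: probabilities over the same labels when the test
        input is replaced by a content-free string such as "N/A".
    Both inputs are assumed to be positive and of equal length.
    """
    if len(label_probs) != len(content_free_probs):
        raise ValueError("both probability lists must cover the same labels")
    # Normalize the content-free probabilities so they sum to 1.
    cf_total = sum(content_free_probs)
    cf = [p / cf_total for p in content_free_probs]
    # Divide out the bias the model shows without any real input.
    scaled = [p / b for p, b in zip(label_probs, cf)]
    # Renormalize so the calibrated scores form a distribution again.
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical numbers: the model prefers "positive" even for "N/A".
content_free = [0.7, 0.3]  # [positive, negative] for the "N/A" input
raw = [0.6, 0.4]           # [positive, negative] for a real test input
calibrated = calibrate(raw, content_free)
predicted = max(range(len(calibrated)), key=calibrated.__getitem__)
```

With these made-up numbers the raw prediction is "positive", but after dividing out the content-free bias the calibrated prediction switches to "negative".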

Prompt Tuning for ICL

This section contains pilot works that might contribute to the prompt selection and prompt formulation strategies of ICL.

  1. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model.

    Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woomyoung Park, Jung-Woo Ha, Nako Sung. [pdf], [project], 2022.04,

    • Investigates how in-context learning performance changes with the source and size of the pretraining corpus.
  2. Chain of Thought Prompting Elicits Reasoning in Large Language Models.

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou. [pdf], [project], 2022.01,

  3. Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.

    Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc Le, Ed Chi. [pdf], [project], 2022.05,

  4. Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator.

    Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, Sang-goo Lee. [pdf], [project], 2022.06,

  5. Iteratively Prompt Pre-trained Language Models for Chain of Thought.

    Boshi Wang, Xiang Deng, Huan Sun. [pdf], [project], 2022.03,

  6. Automatic Chain of Thought Prompting in Large Language Models.

    Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola. [pdf], [project], 2022.10,

  7. Learning To Retrieve Prompts for In-Context Learning. NAACL 2022. Learns an example retriever via contrastive learning.

    Ohad Rubin, Jonathan Herzig, Jonathan Berant. [pdf], [project], 2022.12,

  8. Finetuned Language Models Are Zero-Shot Learners. Instruction tuning.

    Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le. [pdf], [project], 2021.09,

    • finetuning language models on a collection of tasks described via instructions
    • substantially improves zero-shot performance on unseen tasks
  9. Active Example Selection for In-Context Learning.

    Yiming Zhang, Shi Feng, Chenhao Tan. [pdf], [project], 2022.11,

  10. Prompting GPT-3 To Be Reliable. Establishes simple and effective prompts that improve GPT-3's reliability in four aspects: generalizability, social biases, calibration, and factuality.

  11. An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

  12. Self-adaptive In-context Learning

  13. Demystifying Prompts in Language Models via Perplexity Estimation

  14. Structured Prompting: Scaling In-Context Learning to 1,000 Examples

  15. Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity.

    Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp. [pdf], [project], 2021.04,

  16. On the Relation between Sensitivity and Accuracy in In-context Learning.

    Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He. [pdf], [project], 2022.09,

  17. Can language models learn from explanations in context?

    Andrew K. Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill. [pdf], [project], 2022.04

Analysis of ICL

This section contains pilot works that might contribute to the analysis of the influence factors and working mechanism of ICL.

Influence Factors for ICL

  1. Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

    Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer. [pdf], [project], 2022.03,

  2. What Makes Good In-Context Examples for GPT-3?

    Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen. [pdf], [project], 2022.08,

  3. Emergent Abilities of Large Language Models

    Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus. [pdf], [project], 2022.07,

  4. Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations

    Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim. [pdf], [project], 2022.05,

  5. On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model

    Seongjin Shin, Sang-Woo Lee, Hwijeen Ahn, Sungdong Kim, HyoungSeok Kim, Boseop Kim, Kyunghyun Cho, Gichang Lee, Woo-Myoung Park, Jung-Woo Ha, Nako Sung. [pdf], [project], 2022.08,

Working Mechanism of ICL

  1. An Explanation of In-context Learning as Implicit Bayesian Inference

    Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma. [pdf], [project], 2022.08,

  2. In-context Learning and Induction Heads

    Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, Chris Olah. [pdf], [project], 2022.10,

  3. What Can Transformers Learn In-Context? A Case Study of Simple Function Classes

    Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant. [pdf], [project], 2022.08,

  4. Data Distributional Properties Drive Emergent In-Context Learning in Transformers

    Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill. [pdf], [project], 2022.05,

  5. What learning algorithm is in-context learning? Investigations with linear models

    Ekin Akyürek, Dale Schuurmans, Jacob Andreas, Tengyu Ma, Denny Zhou. [pdf], [project], 2022.11,

  6. Transformers learn in-context by gradient descent

    Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov. [pdf], [project], 2022.12,

  7. Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers

    Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei. [pdf], [project], 2022.12

Evaluation and Resources

This section contains pilot works that might contribute to the evaluation or resources of ICL.

  1. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.

    Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt et al. [pdf], [project], 2022.06,

  2. SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks.

    Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit et al. [pdf], [project], 2022.04,

  3. Language Models are Multilingual Chain-of-Thought Reasoners.

    Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei. [pdf], [project], 2022.10,

    • Evaluates the reasoning abilities of large language models in multilingual settings and introduces the Multilingual Grade School Math (MGSM) benchmark, built by manually translating 250 grade-school math problems from the GSM8K dataset into ten typologically diverse languages.
  4. Instruction Induction: From Few Examples to Natural Language Task Descriptions.

    Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy. [pdf], [project], 2022.05,

    • How to learn task instructions from input-output demonstrations.
  5. Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought. 2022.10.3

  6. What is Not in the Context? Evaluation of Few-shot Learners with Informative Demonstrations. arXiv 2212.01692

Application

This section contains pilot works that expand the applications of ICL.

  1. Meta-learning via Language Model In-context Tuning.

    Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He. [pdf], [project], 2021.10,

  2. Does GPT-3 Generate Empathetic Dialogues? A Novel In-Context Example Selection Method and Automatic Evaluation Metric for Empathetic Dialogue Generation.

    Young-Jun Lee, Chae-Gyun Lim, Ho-Jin Choi. [pdf], [project], 2022.10,

  3. In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models.

    Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown. [pdf], [project], 2022.12,

  4. Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

Problems

This section contains pilot works that point out the problems of ICL.

  1. The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design.

    Yoav Levine, Noam Wies, Daniel Jannai, Dan Navon, Yedid Hoshen, Amnon Shashua. [pdf], [project], 2021.10,

Challenges and Future Directions

This section contains pilot works that might contribute to the challenges and future directions of ICL.

How to contribute?

  • Add new papers to the corresponding part and mark them with a tag if the paper has not been included in our survey.
  • If the paper is included in the survey, please replace the tag with the specific section, and add other basic info about the paper, such as authors and conference.
  • If you think a paper does not belong in its current section, please move it to another section with the tag.

Citations

Please consider citing our paper in your publications if this project helps your research. The BibTeX reference is as follows.

@misc{dong2022survey,
      title={A Survey for In-context Learning}, 
      author={Qingxiu Dong and Lei Li and Damai Dai and Ce Zheng and Zhiyong Wu and Baobao Chang and Xu Sun and Jingjing Xu and Lei Li and Zhifang Sui},
      year={2022},
      eprint={2301.00234},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}