edit-distance-papers

A curated list of papers on edit distance as an objective function

Edit-distance as objective function

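There are several research fields in which the edit-distance is chosen as the objective function. For example, in Automatic Speech Recognition (ASR) the main metric of model quality is Word Error Rate (WER), the word-level edit-distance between the hypothesis and the reference transcript divided by the number of words in the reference.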
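As a concrete illustration, WER can be computed with the classic dynamic-programming Levenshtein distance over words. The sketch below is a minimal version that assumes whitespace tokenization and unit costs for insertions, deletions and substitutions; the function names `edit_distance` and `wer` are chosen here for illustration and do not come from any of the listed papers.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    # prev[j] is the distance between the ref prefix processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance between ref[:i] and the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r
                curr[j - 1] + 1,         # insert h
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance divided by reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


# One word ("the") is dropped from a six-word reference.
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

In the usage line, the hypothesis misses one of six reference words, so the WER is 1/6. Note that WER can exceed 1 when the hypothesis contains many insertions.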


Unfortunately, directly optimizing the edit-distance function is difficult, because it is defined over discrete sequences and is not differentiable with respect to the model parameters. Therefore, in most cases, training relies on a proxy objective such as cross-entropy. In the context of sequence learning, however, this leads to several problems [1]:

  1. Exposure Bias: the model is never exposed to its own errors during training, and so the inferred histories at test-time do not resemble the gold training histories.

  2. Loss Evaluation Mismatch: training uses a word-level loss, while at test-time we target improving sequence-level evaluation metrics.

  3. Label Bias: word probabilities at each time-step are locally normalized, which guarantees that successors of incorrect histories receive the same mass as the successors of the true history.


The following table summarizes works that attempt to solve these problems. More detailed overviews exist, for example [2], but this list includes only works that use the edit-distance explicitly or implicitly. Moreover, most of these works formalize the sequence prediction task as an action-taking problem in Reinforcement Learning.

| Year | Task | Reward level | Algorithms, Models | Affiliation | Authors, Link |
|------|------|--------------|--------------------|-------------|---------------|
| 2020 | ASR | Sentence | MWER, RNN-T | Amazon | Guo et al. |
| 2020 | MT | Sentence | MGS, parameter search | NYU | Welleck, Cho |
| 2020 | ASR | Sentence | Proper Noun, Phonetic Fuzzing, MWER, RNN-T, LAS | Google | Peyser, Sainath, Pundak |
| 2019 | NLP | Sentence | GPT-2, PPO, Human labeling | OpenAI | Ziegler, Stiennon et al. |
| 2019 | ASR | Sentence | Neural Architecture Search, REINFORCE, CTC | KPMG Nigeria, OAU | Baruwa et al. |
| 2019 | ASR | Sentence | Normalized MWER | Amazon | Gandhe, Rastrow |
| 2019 | ASR | Token | MBR, RNN-T | Tencent, USA | Weng et al. |
| 2019 | ASR | Token | ECTC-DOCD | China | Yi, Wang, Xu |
| 2019 | ASR | Sentence | MWER, RNN-T, LAS | Google | Sainath, Pang et al. |
| 2019 | MT | Token | Reinforce-NAT, Non-Autoregressive Transformer | China, Tencent | Shao, Feng et al. |
| 2019 | MT, TS, APE | Token | Levenshtein Transformer, imitation learning | Facebook, New York | Gu, Wang, Zhao |
| 2018 | ASR | Token | MBR, softmax margin, PAPB, S2S | Brno, JHU, MERL | Baskar et al. |
| 2018 | ASR | Token | OCD, S2S | Google Brain | Sabour, Chan, Norouzi |
| 2018 | ASR | Token | REINFORCE, S2S | Nara, RIKEN | Tjandra et al. |
| 2018 | TS | Sentence | Alternating Actor-Critic | Hong Kong, Tencent | Li, Bing, Lam |
| 2018 | ASR | Sentence | REINFORCE, PPO, Reward shaping | Tokyo | Peng, Shibata, Shinozaki |
| 2017 | ASR | Sentence | REINFORCE, Self-critic | Salesforce | Zhou, Xiong, Socher |
| 2017 | ASR | Sentence | MWER, LAS, Sampling, N-best | Google | Prabhavalkar et al. |
| 2017 | ASR | Sentence | Expected Loss, RNA | Google | Sak et al. |
| 2017 | MT | Sentence | Actor-Critic, Critic-aware | Hong Kong, New York | Gu, Cho, Li |
| 2016 | ASR | Sentence | Reward Augmented ML | Google Brain | Norouzi et al. |
| 2016 | MT | Token | Actor-Critic | Montreal, McGill | Bahdanau et al. |
| 2015 | MT | Sentence | MIXER | Facebook | Ranzato et al. |
| 2015 | ASR | Token | Task Loss Estimation | Montreal, Wrocław | Bahdanau et al. |
| 2014 | ASR | Sentence | Expected Loss, CTC | DeepMind, Toronto | Graves, Jaitly |
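Several sentence-level entries in the table (MWER, Expected Loss) minimize the expected number of word errors over a set of sampled or N-best hypotheses. The sketch below shows the general shape of such an objective in plain Python. It is an illustration under stated assumptions, not the exact formulation of any listed paper: model log-probabilities are renormalized over the N-best list, the mean error count is subtracted as a baseline, and the names `expected_word_errors`, `nbest` and `log_probs` are hypothetical. In practice the same expression would be written in an autodiff framework so that gradients flow through the hypothesis probabilities.

```python
import math
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = curr
    return prev[-1]


def expected_word_errors(reference: str, nbest: List[str],
                         log_probs: List[float]) -> float:
    """Expected (baseline-subtracted) word errors over an N-best list."""
    if not nbest or len(nbest) != len(log_probs):
        raise ValueError("nbest and log_probs must be non-empty and aligned")
    ref = reference.split()

    # Renormalize the model scores over the N-best list (stable softmax).
    top = max(log_probs)
    weights = [math.exp(lp - top) for lp in log_probs]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Word-level edit distance of every hypothesis to the reference.
    errors = [edit_distance(ref, hyp.split()) for hyp in nbest]

    # Subtracting the mean error count acts as a variance-reducing baseline
    # (an assumption of this sketch).
    mean_errors = sum(errors) / len(errors)
    return sum(p * (e - mean_errors) for p, e in zip(probs, errors))


# Hypothetical N-best list with made-up model probabilities.
nbest = ["the cat sat", "the cat sad", "a cat sad"]
log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
print(round(expected_word_errors("the cat sat", nbest, log_probs), 3))
```

With error counts of 0, 1 and 2 and a mean of 1, the value for this toy list is approximately -0.3. Moving probability mass toward the error-free hypothesis lowers it further, which is the behavior a training objective of this kind rewards.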


  1. Sequence-to-Sequence Learning as Beam-Search Optimization
  2. Deep Reinforcement Learning for Sequence-to-Sequence Models