SLED
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models https://arxiv.org/pdf/2411.02433
Results: 2
SLED issues
The implementation applies a second torch.topk when computing m_i^(n), while Algorithm 1 in the paper defines m_i^(n) over all i_k, which are the top-k from the final layer. Could you...
Upgrade the implementation to transformers 4.46.3 to support the latest LLMs, including the Llama 3 family, Gemma, Mistral, DeepSeek...