Kimi-K2 icon indicating copy to clipboard operation
Kimi-K2 copied to clipboard

Repetition issue: duplicated generations when testing Kimi-K2 with sglang

Open Ansel-Pu opened this issue 6 months ago • 3 comments

When testing the Kimi-K2 model through sglang, I encountered repeated generations in the output. My sampling parameter is temperature = 0.6, top=0.95 or 1.0.

"data": {"_input_text": "<|im_system|>system<|im_middle|>You are Kimi, an AI assistant created by Moonshot AI.<|im_end|><|im_user|>user<|im_middle|>Answer the following multiple choice question. The last line of your response should be of the following format: 'Answer: $LETTER' (without quotes) where LETTER is one of ABCD. Think step by step before answering.\n\nYou are studying a nuclear decay which converts two heavy nucleons of flavor A to another flavor B, while simultaneously emitting two much lighter particles E and V. In short, 2A -> 2B + 2E + 2V. It is known that the total energy spectrum of the outgoing E particles is continuous, with some endpoint value Q.\n\nA variant of this decay emits one exotic, massless particle M instead of the 2V. In this case, how does the total energy spectrum of the outgoing E particles compare to that of the original decay?\n\nA) The spectrum remains continuous with an adjusted shape, and the endpoint decreases.\nB) The spectrum becomes discrete, and the endpoint increases.\nC) The spectrum remains continuous with an adjusted shape, and the endpoint increases.\nD) The spectrum becomes discrete, and the endpoint decreases.<|im_end|><|im_assistant|>assistant<|im_middle|>", "_choices": ["A", "B", "C", "D"], "_gold_index": 2, "_gold_answer": ["C"]}, "meta_info": {"id": "7b165ac18f914547b2f77f3a50a8dedf", "finish_reason": {"type": "length", "length": 32768}, "prompt_tokens": 242, "completion_tokens": 32768, "cached_tokens": 0, "e2e_latency": 1600.6577031612396}, "output_text": "Let's break down this problem step by step.\n\n**Understanding the Original Decay: 2A -> 2B + 2E + 2V**\n\n1. **Energy Conservation:** The total energy released in the decay is shared among the outgoing particles (2B, 2E, 2V).\n2. **Continuous Spectrum:** The problem states that the total energy spectrum of the outgoing E particles is continuous. This implies that the energy is shared among *multiple* particles in the final state, and the energy of any single particle (like E) can vary. This is characteristic of a three-body (or more) decay. In this case, the decay is effectively 2A -> 2B + (2E + 2V). The \"2E + 2V\" system has a continuous energy distribution.\n3. **Endpoint Value Q:** This Q represents the maximum possible kinetic energy that the *combined* 2E particles can have. This maximum occurs when the 2B particles and the 2V particles carry away the minimum possible energy (i.e., their rest mass energy, if any, and zero kinetic energy).\n\n**Understanding the Variant Decay: 2A -> 2B + 2E + M**\n\n1. **Change in Final State:** The key difference is that the two massless V particles are replaced by a *single* massless M particle.\n2. **Energy Conservation (Variant):** The total energy released in the decay is now shared among 2B, 2E, and M.\n3. **Spectrum of E (Variant):**\n * **Discrete vs. Continuous:** Since M is massless, and we have 2B and 2E, this is still a multi-body decay (effectively 2A -> 2B + (2E + M)). The energy can still be distributed continuously among these particles. Therefore, the spectrum of the outgoing E particles will *remain continuous*. It will have an adjusted shape because the phase space for the decay has changed (replacing two massless particles with one massless particle).\n * **Endpoint Comparison:** Let's consider the maximum energy for the 2E particles.\n * **Original Decay (2A -> 2B + 2E + 2V):** The maximum kinetic energy for 2E occurs when 2B and 2V have minimum energy (zero kinetic energy). The total energy available for the *kinetic* energy of 2E and 2V (and 2B) is the mass difference between 2A and 2B, minus the rest mass of 2V (if any). Let's denote the total energy release as \\(E_{total}\\). The endpoint Q for 2E is when 2B and 2V take minimum energy.\n * **Variant Decay (2A -> 2B + 2E + M):** The maximum kinetic energy for 2E occurs when 2B and M take minimum energy. Since M is massless, its minimum energy is zero. The total energy available for the kinetic energy of 2E and M (and 2B) is the mass difference between 2A and 2B.\n * **Crucial Point:** In the original decay, the two V particles share energy. In the variant, a single M particle takes the energy. The *total* energy available for the *kinetic* energy of the *light* particles (E and V/M) is the same, determined by the mass difference between the initial and final heavy particles (2A and 2B).\n * However, the question asks about the *total energy spectrum of the outgoing E particles*. Let's rephrase: the maximum *kinetic* energy that the *combined* 2E particles can have.\n * Consider the energy available for the *light* particles. In the original decay, the energy is shared among 2E and 2V. In the variant, it's shared among 2E and M.\n * Let's think about the *maximum possible energy* for the 2E system.\n * Original decay: 2A -> 2B + (2E + 2V). The maximum energy for 2E occurs when 2B is at rest and 2V carries away the minimum possible energy (zero kinetic energy). The total energy released is \\(E_{release} = (m_{2A} - m_{2B} - m_{2V})c^2\\) (if 2V has rest mass) or simply the Q-value of the reaction. The endpoint Q for 2E is the maximum kinetic energy the 2E system can have.\n * Variant decay: 2A -> As -> 2B + (2E + M).\n * Let's consider the total energy available for the *light* particles. In the original decay, the energy is shared among 2E and 2V. In the variant, it's shared among 2E and M.\n * The *total* energy released in the decay is fixed by the mass difference between 2A and 2B. Let's call this \\(E_{decay}\\).\n * Original: \\(E_{decay} = KE_{2B} + KE_{2E} + KE_{2V}\\). The endpoint Q for 2E is when \\(KE_{2B}\\) and \\(KE_{Some part of 2V}\\) are minimized.\n * Variant: \\(E_{decay} = KE_{2B} + KE_{2E} + KE_{M}\\).\n * The key is that in the original decay, the *two* V particles carry away energy. In the variant, a *single* M particle carries away energy.\n * Let's consider the *maximum possible kinetic energy* for the *combined* 2E particles.\n * Original decay: 2A -> 2B + 2E + 2V. The maximum energy for 2E occurs when 2B and 2V have minimum energy. The total energy available for the *light* particles (2E + 2V) is \\(E_{avail} = (m_{2A} - m_{2B})c^2\\). The endpoint Q for 2E is the maximum kinetic energy the 2E system can have.\n * Variant decay: 2A -> 2B + 2E + M.\n * Let's consider the *maximum possible kinetic energy* for the *combined* 2E particles.\n * Original decay: 2A -> 2B + 2E + 2V. The maximum energy for 2E occurs when 2B and 2V have minimum energy. The total energy available for the *light* particles (2E + 2V) is \\(E_{avail} = (m_{2A} - m_{2B})c^2\\). The endpoint Q for 2E is the maximum kinetic energy the 2E system can have.\n * Let's consider the *maximum possible kinetic energy* for the *combined* 2E particles.\n * Original decay: 2A -> 2B + 2E + 2V. The maximum energy for 2E occurs when 2B and 2V have minimum energy. The total energy available for the *light* particles (2E + 2V) is \\(E_{avail} = (m_{2A} - m_{2B})c^2\\). The endpoint Q for 2E is the maximum kinetic energy the 2E system can have.\n * Let's consider the *maximum possible kinetic energy* for the *combined* 2E particles.\n * Original decay: 2A -> 2B + 2E + 2V. The maximum energy for 2E occurs when 2B and 2V have minimum energy. The total energy available for the *light* particles (2E + 2V) is \\(E_{avail} = (m_{2A} - m_{2B})c^2\\). The endpoint Q for 2E is the maximum kinetic energy the 2E system can have.\n * Let's consider the *maximum possible kinetic energy* for the *combined* 2E particles.\n * Original decay: 2A -> 2B + 2E + 2V. The maximum energy for 2E occurs when 2B and 2V have minimum energy. The total energy available for the *light* particles (2E + 2V) is \\(E_{avail} = (m_{2A} - m_{2B})c^2\\). The endpoint Q for 2E is the maximum kinetic energy the 2E system can have.\n * Let's consider the *maximum possible kinetic energy* for the *combined* 2E particles.\n * Original decay: 2A -> 2B + 2E + 2V. The maximum energy for 2E occurs when 2B and 2V have minimum energy. The total energy available for the *light* particles (2E + 2V) is \\(E_{avail} = (m_{2A} - m_{2B})c^2\\). The endpoint Q for 2E is the maximum kinetic energy the 2E system can have.\n * Let's consider the *maximum possible kinetic energy* for the *combined* As -> 2B + 2E + 2V.\n * Let's consider the *maximum possible kinetic energy* for the *combined* 2E particles.\n * Original decay: 2A -> 2B + 2E + 2V. The maximum energy for 2E occurs when 2B and 2V have minimum energy. The total energy available for the *light* particles (2E + 2V) is \\(E_{avail} = (m_{2A} - m_{2B})c^2\\).

Ansel-Pu avatar Aug 19 '25 01:08 Ansel-Pu

The repetition you're seeing in output is likely due to the temperature and Top P sampling parameters being set too low or an issue with the model's token generation process, known as a stuck-in-a-loop scenario. In some cases, with specific models and prompts, the model can enter a state where it generates the same token or a small sequence of tokens repeatedly.

Harshchafle avatar Aug 27 '25 03:08 Harshchafle

I have a similar issue even though I'm using temperature of 0.0 and top-p of 1.0 (i.e greedy decoding). Not sure why!

hammoudhasan avatar Aug 30 '25 12:08 hammoudhasan

I hava the same issue

acodercat avatar Sep 14 '25 09:09 acodercat