vidur
In a multi-round conversation, what does the trace of the n-th round represent?
Does the num_prefill_tokens for round n correspond solely to the prompt of that round, or does it include all prompts from rounds 1 to n and the outputs from rounds 1 to n–1? If it includes the full prompt and output history, could it easily exceed the request length threshold?
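To make the two readings concrete, here is a minimal sketch of how the prefill count for round n would differ under each one. The function names, the sample prompt and output lengths, and the 500-token limit are all made up for illustration; none of them come from the vidur codebase, and this does not claim to show how vidur actually builds its traces.

```python
def per_round_prefill(prompt_lens):
    """Interpretation A: prefill for round n is only that round's prompt."""
    return list(prompt_lens)


def cumulative_prefill(prompt_lens, output_lens):
    """Interpretation B: prefill for round n is prompts 1..n plus outputs 1..n-1."""
    totals = []
    history = 0  # tokens carried over from all earlier rounds
    for prompt, output in zip(prompt_lens, output_lens):
        totals.append(history + prompt)
        history += prompt + output
    return totals


def first_round_over_limit(prefill_lens, limit):
    """Return the 1-based round whose prefill first exceeds limit, or None."""
    for round_no, tokens in enumerate(prefill_lens, start=1):
        if tokens > limit:
            return round_no
    return None


# Hypothetical three-round conversation (token counts are invented).
prompts = [100, 50, 80]
outputs = [200, 150, 120]
LIMIT = 500  # hypothetical request length threshold

print(per_round_prefill(prompts))             # [100, 50, 80]
print(cumulative_prefill(prompts, outputs))   # [100, 350, 580]
print(first_round_over_limit(cumulative_prefill(prompts, outputs), LIMIT))  # 3
```

Under the cumulative reading the third round already passes the made-up limit, while under the per-round reading no round does, which is the case the question is asking about.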