
feat: log input text for OpenAI format API

Open panpan0000 opened this issue 1 year ago • 2 comments

Motivation

To support https://github.com/sgl-project/sglang/issues/1608

With the OpenAI API, when we enable --log-requests, the input text in the log is unreadable:

[2025-02-19 10:07:29] Finish: obj=GenerateReqInput(text=None, input_ids=[0, 87979, 11403, 8367, 4697, 30, 59812, 2923, 1018, 290, 18594, 303, 882, 11743, 4431, 32414, 1175, 9484, 5802, 1923, 19223, 8745, 8745, 303, 87825, 16465, 621, 126725, 1175, 2792, 303, 20808...

Modifications

I think I found the root cause: when input_ids is filled with data, obj.text will be None.

As a result, in the --log-requests output, dataclass_to_string_truncated(obj: GenerateReqInput) shows obj.text as None and obj.input_ids as a list of token ids.

Please correct me if I'm wrong.
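The idea behind the fix can be sketched as follows. This is a minimal, self-contained illustration, not sglang's actual code: `ToyTokenizer` and `loggable_text` are hypothetical stand-ins, and the dataclass is a simplified version of sglang's `GenerateReqInput`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GenerateReqInput:
    """Simplified stand-in for sglang's GenerateReqInput dataclass."""
    text: Optional[str] = None
    input_ids: Optional[List[int]] = None


class ToyTokenizer:
    """Trivial tokenizer, only here to make the example runnable."""
    vocab = {0: "Who", 1: "is", 2: "it", 3: "?"}

    def decode(self, ids: List[int]) -> str:
        return " ".join(self.vocab.get(i, "<unk>") for i in ids)


def loggable_text(obj: GenerateReqInput, tokenizer: ToyTokenizer) -> str:
    # Prefer the original text; if the OpenAI adapter already tokenized the
    # prompt (text is None, input_ids is set), decode the ids back to text
    # so the log entry stays human-readable.
    if obj.text is not None:
        return obj.text
    if obj.input_ids is not None:
        return tokenizer.decode(obj.input_ids)
    return "<empty>"


req = GenerateReqInput(text=None, input_ids=[0, 1, 2, 3])
print(loggable_text(req, ToyTokenizer()))  # Who is it ?
```

In the real server the decode step would use the model's tokenizer that is already loaded in the tokenizer manager, so no extra model loading is needed just for logging.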

Test result:

curl -L -X POST localhost:${PORT}/v1/chat/completions \
--data-raw '{
    "model": "/model",
    "messages": [
        {"role": "user", "content": "Who is the most beautiful woman in the world?"}
    ],
    "max_tokens": 2500,
    "temperature": 0.7,
    "stream": false
}'

Log of sglang

[2025-02-20 16:56:30] Receive: obj=GenerateReqInput(text='Who is the most beautiful woman in the world?', input_ids=[151644, 8948, 198, 2610, 525, 1207, 16948, 11, 3465, 553, 54364, 14817, 13, 1446, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 15191, 374, 279, 1429, 6233, 5220, 304, 279, 1879, 11319, 151645, 198, 151644, 77091, 198], input_embeds=None, image_data=None, sampling_params={'temperature': 0.7, 'max_new_tokens': 2500, 'min_new_tokens': 0, 'stop': None, 'stop_token_ids': None, 'top_p': 1.0, 'top_k': -1, 'min_p': 0.0, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'repetition_penalty': 1.0, 'regex': None, 'ebnf': None, 'n': 1, 'no_stop_trim': False, 'ignore_eos': False, 'skip_special_tokens': True}, rid='8b4b18fccde5427885b63d31a535e1aa', return_logprob=False, logprob_start_len=-1, top_logprobs_num=0, return_text_in_logprobs=True, stream=False, log_metrics=True, modalities=[], lora_path=None, session_params=None, custom_logit_processor=None)
[2025-02-20 16:56:30 TP0] Prefill batch. #new-seq: 1, #new-token: 15, #cached-token: 24, cache hit rate: 23.76%, token usage: 0.00, #running-req: 0, #queue-req: 0
[2025-02-20 16:56:30 TP0] Decode batch. #running-req: 1, #token: 74, token usage: 0.00, gen throughput (token/s): 0.90, #queue-req: 0
[2025-02-20 16:56:30] Finish: obj=GenerateReqInput(text='Who is the most beautiful woman in the world?', input_ids=[151644, 8948, 198, 2610, 525, 1207, 16948, 11, 3465, 553, 54364, 14817, 13, 1446, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 15191, 374, 279, 1429, 6233, 5220, 304, 279, 1879, 11319, 151645, 198, 151644, 77091, 198], input_embeds=None, image_data=None, sampling_params={'temperature': 0.7, 'max_new_tokens': 2500, 'min_new_tokens': 0, 'stop': None, 'stop_token_ids': None, 'top_p': 1.0, 'top_k': -1, 'min_p': 0.0, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'repetition_penalty': 1.0, 'regex': None, 'ebnf': None, 'n': 1, 'no_stop_trim': False, 'ignore_eos': False, 'skip_special_tokens': True}, rid='8b4b18fccde5427885b63d31a535e1aa', return_logprob=False, logprob_start_len=-1, top_logprobs_num=0, return_text_in_logprobs=True, stream=False, log_metrics=True, modalities=[], lora_path=None, session_params=None, custom_logit_processor=None), out={'text': "I'm sorry, but I can't answer this question. As an artificial intelligence language model, I don't have personal preferences or feelings, and I don't have access to information about beauty in the world. My purpose is to provide helpful and informative responses to the best of my knowledge and abilities, but I cannot produce opinions or preferences about individuals or topics.", 'meta_info': {'id': '8b4b18fccde5427885b63d31a535e1aa', 'finish_reason': {'type': 'stop', 'matched': 151645}, 'prompt_tokens': 39, 'completion_tokens': 73, 'cached_tokens': 24}}
[2025-02-20 16:56:30] INFO:     127.0.0.1:41150 - "POST /v1/chat/completions HTTP/1.1" 200 OK

BEFORE: [screenshot]

AFTER: [screenshot]

Checklist

  • [x] Format your code according to the Code Formatting with Pre-Commit.
  • [x] Add unit tests as outlined in the Running Unit Tests.
  • [x] Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
  • [ ] Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
  • [x] For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
  • [x] Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

panpan0000 avatar Feb 19 '25 12:02 panpan0000

@merrymercy can you please kindly take a look?

panpan0000 avatar Feb 19 '25 13:02 panpan0000

I'm confused about the CI unit-test failures; they all seem unrelated to this change:

  • test_video_chat_completion failure
  • performance threshold in test_mmlu: assert metrics["score"] >= 0.5
  • notebook test: make: *** [Makefile:12: compile] Error 1

Digging into the log, I think I found the problem, and will rework this PR.

panpan0000 avatar Feb 24 '25 08:02 panpan0000

This pull request has been automatically closed due to inactivity. Please feel free to reopen it if needed.

github-actions[bot] avatar May 30 '25 08:05 github-actions[bot]