
How to set request id in Generate?

Missmiaom opened this issue 1 year ago

This doesn't seem to work:

/v2/models/ensemble/generate

{
    "text_input": "...",
    "parameters": {
        "id": "123"
    },
    "sequence_id": "456"
}
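For reference, a request like this would typically be sent to the generate endpoint as follows (a minimal sketch; the host and port are assumptions):

curl -X POST localhost:8000/v2/models/ensemble/generate \
    -H "Content-Type: application/json" \
    -d '{"text_input": "...", "parameters": {"id": "123"}, "sequence_id": "456"}'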

verbose log:

[screenshot of verbose log]

Missmiaom (Mar 08 '24)

Hi @Missmiaom,

Can you please provide full reproduction steps (all commands run) and fill out the bug template below? Thank you!

Description
A clear and concise description of what the bug is.

Triton Information
What version of Triton are you using?

Are you using the Triton container or did you build it yourself?

To Reproduce
Steps to reproduce the behavior.

Describe the models (framework, inputs, outputs); ideally include the model configuration file (if using an ensemble, include the model configuration file for that as well).

Expected behavior
A clear and concise description of what you expected to happen.

yinggeh (Mar 08 '24)

Triton Information
What version of Triton are you using? 23.10

Are you using the Triton container or did you build it yourself? Container.

To Reproduce
When I use the ensemble model and call it through the /v2/models/ensemble/generate endpoint, the request ID I pass is not printed in Triton's built-in verbose log.

Expected behavior
Triton's built-in verbose log prints the request ID.
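For reference, verbose logging is enabled with the --log-verbose flag when starting the server (a minimal sketch; the model repository path is an assumption):

tritonserver --model-repository=/models --log-verbose=1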

@yinggeh

Missmiaom (Mar 11 '24)

Hi @Missmiaom. Thanks for waiting. I have opened ticket DLIS-6456 for our engineers to investigate.

yinggeh (Apr 10 '24)

@yinggeh Has this feature been completed?

dafu-wu (Jul 08 '24)

@dafu-wu PR https://github.com/triton-inference-server/server/pull/7392 is currently under review. Thanks for your patience.

yinggeh (Jul 09 '24)

PR https://github.com/triton-inference-server/server/pull/7392 merged. Reopen for further questions.
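With this change, the request ID should be settable via a top-level "id" field in the generate request, along these lines (a sketch based on the request body quoted later in this thread; host, port, and payload values are assumptions):

curl -X POST localhost:8000/v2/models/ensemble/generate \
    -d '{"id": "123", "text_input": "What is machine learning?", "max_tokens": 20}'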

yinggeh (Jul 15 '24)

@yinggeh Thanks for the update. I used the latest Triton server image (24.07) to build the TensorRT-LLM backend, but the log still has a problem:

[screenshot of verbose log]

Request body:

{
    "id": "11111",
    "sequence_id": "456",
    "text_input": "What is machine learning?",
    "max_tokens": 20,
    "bad_words": "",
    "stop_words": "",
    "pad_id": 2,
    "end_id": 2,
    "top_p": 1,
    "top_k": 1,
    "temperature": 0.7
}

Is there something wrong with the way I am using it? @shreyas-samsung, can you help explain?

dafu-wu (Aug 15 '24)

@dafu-wu Looks like PR https://github.com/triton-inference-server/server/pull/7392 missed the 24.07 release deadline. Could you try building from the latest source, or wait for the 24.08 image?
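Once published, the 24.08 container should be available under the usual NGC name (an assumption based on the standard release naming pattern):

docker pull nvcr.io/nvidia/tritonserver:24.08-py3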

yinggeh (Aug 15 '24)

@yinggeh Do you know which Dockerfile is used to build the official image?

dafu-wu (Aug 15 '24)

@dafu-wu Thanks for your patience. Are you referring to Dockerfile.* in the server repo?

yinggeh (Aug 18 '24)