Text generation sentences out of order
Hello, I've tried searching and a couple of different serverless models, but the text generation sentences all seem to be out of order. Here are a couple of examples, and I can add more. I tried the CLI and the website POST tool. I must be missing a setting or something, but I'm not sure I can even parse these sentences back into order.
"tokens": [
"!\nIt's great to meet you!\nThis is my first time using this platform"
"tokens": [
"! But wait, I'm confused: we've got a problem where... Oh"
"tokens": [
"! - City by city temperature...\")\", city using the fits.\n</think>\n\n"
"tokens": [
" about you and your life.\nI'm sorry, but as an AI language"
"tokens": [
" that describe who/what I am.\nI would greatly appreciate any assistance with this"
Generated phrase: from the Harry Potter series "To be or not to be that is Job completed: {'delayTime': 467, 'executionTime': 302, 'id': 'e8e711f7-886d-4ee3-a1ba-e34f33f685f3-u1', 'output': [{'choices': [{'tokens': [' from the Harry Potter series\n "To be or not to be that is']}],
Generated phrase: to describe a happy moment. Joyful is a great word! If Job completed: {'delayTime': 394, 'executionTime': 371, 'id': '01680530-e3e5-4f82-9268-37f2092cc29d-u1', 'output': [{'choices': [{'tokens': [' to describe a happy moment. Joyful is a great word! If']}],
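For reference, the completed-job responses above can be unpacked like this (a minimal sketch based only on the `output -> choices -> tokens` structure shown in the logs; the sample job dict below is copied from the first response):

```python
# Minimal sketch: extract generated text from a completed RunPod
# serverless job response, using the structure shown in the logs above.
job = {
    'delayTime': 467,
    'executionTime': 302,
    'id': 'e8e711f7-886d-4ee3-a1ba-e34f33f685f3-u1',
    'output': [
        {'choices': [
            {'tokens': [' from the Harry Potter series\n "To be or not to be that is']}
        ]}
    ],
}

# Walk output -> choices -> tokens and join the pieces into one string.
text = ''.join(
    token
    for item in job['output']
    for choice in item['choices']
    for token in choice['tokens']
)
print(text)
```

This only concatenates the tokens in the order the API returned them, so if the endpoint itself is emitting mid-sentence continuations, the joined text will still start mid-sentence.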
Can you share your request payload?
It was very basic, like:
curl -X POST https://api.runpod.ai/v2/
https://docs.runpod.io/serverless/workers/vllm/get-started — if you scroll down a bit, you will find some sampling parameters to adjust. Please try it with those.
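For example, a request that sets sampling parameters explicitly might look like the sketch below. This is an assumption-heavy illustration: the endpoint ID and API key are placeholders, and the `input`/`sampling_params` shape follows the vLLM worker docs linked above, so adjust it if your worker expects something different.

```python
# Sketch of a /runsync request with explicit sampling parameters.
# ENDPOINT_ID and API_KEY are placeholders, not real values.
import json
import urllib.request

ENDPOINT_ID = "<endpoint_id>"
API_KEY = "<your_api_key>"

payload = {
    "input": {
        "prompt": "Describe a happy moment.",
        "sampling_params": {
            "temperature": 0.7,
            "max_tokens": 100,
        },
    }
}

req = urllib.request.Request(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# Uncomment to actually send the request:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```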
Also, just a comment: the main site for starting serverless is unbearably slow. It is similar to how slow GCP is, and I won't use it. Unfortunately, there is no CLI to create pods or manage serverless. Just a heads-up that you're probably losing business because of this UI issue. I'm actually curious why it seems to be so difficult for these large backend providers to allocate enough resources to the website GUI; it doesn't make sense to me.
Thanks, but I believe this has nothing to do with the params. It happens with several different models; something is wrong with your API. You can test this very easily, as the output is wrong in the GUI tester on the website too.