Weird bug where malformed API request causes model to analyze error message
I know this is contrived but I had it happen just now and wanted to report it as a bug.
I ran dllama-api in one window (using llama3_2_3b_instruct_q40).
Then in another window I used netcat (nc) to connect to port 9990.
I manually typed:
"POST /v1/chat/completions HTTP/1.0"
(without the quotes) and hit Enter. It then waited for the headers and arguments I would normally give, and I hit Enter again. I expected an error, but it looks like some error was actually seen by the LLM. In the dllama-api window output below, you can see the few GET commands I tried by hand before the POST, followed by the rest of the output:
⭕ socket[0]: connecting to 10.0.0.2:9998 worker
⭕ socket[0]: connected
⭕ socket[1]: connecting to 10.0.0.3:9998 worker
⭕ socket[1]: connected
⭕ socket[2]: connecting to 10.0.0.4:9998 worker
⭕ socket[2]: connected
💡 arch: llama
💡 hiddenAct: silu
💡 dim: 3072
💡 hiddenDim: 8192
💡 nLayers: 28
💡 nHeads: 24
💡 nKvHeads: 8
💡 vocabSize: 128256
💡 origSeqLen: 131072
💡 seqLen: 8192
💡 nSlices: 4
💡 ropeTheta: 500000.0
📄 bosId: 128000
📄 eosId: 128001
📄 chatEosId: 128009
🚧 Cannot allocate 1576009728 bytes directly in RAM
🕒 ropeCacheSize: 24576 kB
⏩ Loaded 3304476 kB
Listening on 0.0.0.0:9990...
⭐ chat template: llama3
🛑 stop: <|eot_id|>
Server URL: http://127.0.0.1:9990/v1/
🔷 UNKNOWN
🔷 GET /
🔷 GET /v1/
🔷 GET /v1/models
🔷 GET /v1/chat/completions
🔷 POST /v1/chat/completions
🔹<|start_header_id|>assistant<|end_header_id|>
🔸Problems with the current code
The current code has a few issues:
- Incorrect Usage of `insert()` Method: The `insert()` method is used to insert a value into a collection, but it's not being used correctly. It should be called with a value and an index, not with a string and an index.
- Missing Error Handling: The code doesn't handle any potential errors that might occur when using the `insert()` method.
- Missing Input Validation: The code doesn't validate the input values, which can lead to unexpected behavior.
Improved Code
Here's an improved version of the code that addresses the issues mentioned above:
def insert_value():
    """
    Inserts a value into a list at a specified index.
    """
    # Get the list from the user
    lst = input("Enter the list: ")
    try:
        # Convert the list to a Python list
        lst = [x for x in lst.split() if x != '']
    except ValueError:
        print("Invalid input. Please enter a list of values separated by spaces.")
        return

    # Get the index from the user
    try:
        index = int(input("Enter the index to insert the value: "))
    except ValueError:
        print("Invalid input. Please enter a valid index.")
        return

    # Get the value to insert from the user
    value = input("Enter the value to insert: ")

    # Check if the index is valid
    if index < 0 or index > len(lst):
        print("Invalid index. Please enter a valid index.")
        return

    # Insert the value at the specified index
    lst.insert(index, value)

    # Print the updated list
    print("Updated list:", lst)


if __name__ == "__main__":
    insert_value()

Example Use Case
To use this code, simply run it and follow the prompts. For example:
Enter the list: 1 2 3
Enter the index to insert the value: 2
Enter the value to insert: 4
Updated list: [1, 2, 3, 4] ```<|eot_id|>🔶
While it was amusing, it could potentially be used as an attack surface, so I wanted to mention it.
The output of my netcat (nc) command:
$ nc 127.0.0.1 9990
POST /v1/chat/completions HTTP/1.0

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 2242

{"choices":[{"finish_reason":"","index":-458634096,"message":{"content":"Problems with the current code\n\nThe current code has a few issues:\n\n1. Incorrect Usage of
insert()Method: Theinsert()method is used to insert a value into a collection, but it's not being used correctly. It should be called with a value and an index, not with a string and an index.\n2. Missing Error Handling: The code doesn't handle any potential errors that might occur when using theinsert()method.\n3. Missing Input Validation: The code doesn't validate the input values, which can lead to unexpected behavior.\n\nImproved Code\n\nHere's an improved version of the code that addresses the issues mentioned above:\n\npython\ndef insert_value():\n \"\"\"\n Inserts a value into a list at a specified index.\n \"\"\"\n # Get the list from the user\n lst = input(\"Enter the list: \")\n try:\n # Convert the list to a Python list\n lst = [x for x in lst.split() if x != '']\n except ValueError:\n print(\"Invalid input. Please enter a list of values separated by spaces.\")\n return\n\n # Get the index from the user\n try:\n index = int(input(\"Enter the index to insert the value: \"))\n except ValueError:\n print(\"Invalid input. Please enter a valid index.\")\n return\n\n # Get the value to insert from the user\n value = input(\"Enter the value to insert: \")\n\n # Check if the index is valid\n if index < 0 or index > len(lst):\n print(\"Invalid index. Please enter a valid index.\")\n return\n\n # Insert the value at the specified index\n lst.insert(index, value)\n\n # Print the updated list\n print(\"Updated list:\", lst)\n\n\nif __name__ == \"__main__\":\n insert_value()\n\n\nExample Use Case\n\nTo use this code, simply run it and follow the prompts. For example:\n\n\nEnter the list: 1 2 3\nEnter the index to insert the value: 2\nEnter the value to insert: 4\nUpdated list: [1, 2, 3, 4]\n","role":"assistant"}}],"created":1737518828,"id":"cmpl-j0","model":"Distributed Model","object":"chat.completion","usage":{"completion_tokens":454,"prompt_tokens":15,"total_tokens":469}}
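For anyone who wants to reproduce the netcat session without typing it by hand, here is a rough Python sketch of what I did. The helper names `build_bare_request` and `send_raw` are just made up for this example, and I'm assuming nc sent a bare newline for each Enter key press (which is what it does by default on my system, as far as I know).

```python
import socket


def build_bare_request(path: str = "/v1/chat/completions") -> bytes:
    """Build a request line followed by an empty line: no headers, no body.

    Assumption: nc sent a bare LF per Enter. Use "\r\n" instead of "\n"
    if you want strictly correct HTTP line endings.
    """
    return f"POST {path} HTTP/1.0\n\n".encode("ascii")


def send_raw(host: str, port: int, payload: bytes, timeout: float = 120.0) -> bytes:
    """Send raw bytes and read until the server closes the connection or we time out."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # server closed the connection
                chunks.append(data)
        except socket.timeout:
            pass  # keep whatever arrived before the timeout
    return b"".join(chunks)
```

With dllama-api listening as in my setup, `print(send_raw("127.0.0.1", 9990, build_bare_request()).decode("utf-8", errors="replace"))` should show whatever the server sends back. The timeout is generous because the model has to finish generating before the response arrives.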
Similarly, if I run this curl command with no headers, no JSON body, etc., I get random German text:
$ curl -X POST http://10.0.0.1:9990/v1/chat/completions
That yielded:
🔸Mein Interesse an der folgenden Frage:
Wie kann ich ein dekoratives Präsentationspapier mit verschiedenen Mustern und Farben erstellen, das auch bei Vorträgen oder anderen Präsentationen ein guter Eindruck macht?
Meine Umgebung:
- Ich habe ein Computer mit einer 24-Zoll
In case you're curious, Google Translate says:
My interest in the following question:
How can I create a decorative presentation paper with different patterns and colors that also makes a good impression during lectures or other presentations?
My environment:
- I have a computer with a 24-inch
My initial concern was that a malformed request caused some error in dllama-api that was then "seen" by the LLM, but now I'm thinking it might just be the model's random response to an empty prompt. So maybe this isn't a valid bug/issue after all? Either way, I'm leaving it here so it's known. Maybe we should check for an empty or malformed prompt before running inference on it?
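To make the suggestion a bit more concrete, here is a minimal sketch of the kind of check I have in mind, written in Python only for brevity. It is not dllama-api's actual code; the function name `validate_chat_request` and the error strings are hypothetical. It also assumes each message carries a plain string `content`, which is a simplification.

```python
import json


def validate_chat_request(body: bytes):
    """Return (status, error_message); status 200 means it is OK to run inference."""
    # Reject a missing or whitespace-only body before it ever reaches the model.
    if not body or not body.strip():
        return 400, "empty request body"
    try:
        payload = json.loads(body)
    except (UnicodeDecodeError, json.JSONDecodeError):
        return 400, "body is not valid JSON"
    if not isinstance(payload, dict):
        return 400, "body must be a JSON object"
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        return 400, "'messages' must be a non-empty list"
    for msg in messages:
        # Simplifying assumption: role and content are both plain strings.
        if (
            not isinstance(msg, dict)
            or not isinstance(msg.get("role"), str)
            or not isinstance(msg.get("content"), str)
        ):
            return 400, "each message needs a string 'role' and 'content'"
    return 200, ""
```

With a check like this, both of my requests above (no headers, no body) would get a 400 back instead of a generated completion.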