matrix_chatgpt_bot icon indicating copy to clipboard operation
matrix_chatgpt_bot copied to clipboard

Robots do not respond to messages with "!v"

Open mwnu opened this issue 1 year ago • 14 comments

"gpt_vision_api_endpoint": "https://xxxx/v1/chat/completions", "gpt_vision_model": "gpt-4-turbo",

"content": {
    "body": "!v Please explain the above diagram.",
    "format": "org.matrix.custom.html",
    "formatted_body": "!v Please explain the above diagram.",
    "msgtype": "m.text"
  }

No response, no logs.

mwnu avatar Apr 24 '24 12:04 mwnu

You should quote a image. Besides, gpt vision won't work in E2EE room. image image

hibobmaster avatar Apr 24 '24 12:04 hibobmaster

You should quote a image. Besides, gpt vision won't work in E2EE room. image image

I use @+!v, and it responds twice; the first time, it indicates that it doesn't know.

mwnu avatar Apr 24 '24 13:04 mwnu

Like thread level chatting for element android before, !v gpt vision command won't work... No m.mentions from element android cause a lot of inconvinient things. https://github.com/hibobmaster/matrix_chatgpt_bot/blob/90a00c9b5aa95cdc2e29f1b6eefe31ebcf3c1e69/src/bot.py#L378-L386

hibobmaster avatar Apr 24 '24 13:04 hibobmaster

Can you provide a screenshot?

hibobmaster avatar Apr 24 '24 13:04 hibobmaster

Can you provide a screenshot?

image

mwnu avatar Apr 24 '24 13:04 mwnu

I know what's wrong. When we mention the bot, it trigger thread chat as the same time. image

hibobmaster avatar Apr 24 '24 13:04 hibobmaster

I know what's wrong. When we mention the bot, it trigger thread chat as the same time. image

Perhaps the command "!v" can be omitted. Instead, different models could be invoked based on the event's mimetype, as some models, such as gpt-4-turbo and claude-3, support vision.

mwnu avatar Apr 24 '24 14:04 mwnu

With https://github.com/hibobmaster/matrix_chatgpt_bot/commit/81543d561b46df4158892324172b5145e44f0e32, mention bot

  • with image will trigger gpt vision
  • with plain text will trigger thread level chatting

Try image: hibobmaster/matrixchatgptbot:sha-81543d561b46df4158892324172b5145e44f0e32 image image

hibobmaster avatar Apr 24 '24 15:04 hibobmaster

sha-81543d561b46df4158892324172b5145e44f0e32

Robot can now recognize images in rooms without !v, but they cannot perform this function within threads. Additionally, commands like !pic, !help, and !lc are also unusable in threads, indicating that the two interaction modes are not well integrated. Of course, commands such as !gpt, !chat, and !new are unnecessary in threads. However, due to the poor compatibility of "io.element.thread" (Element PC's implementation for servers that do not support the Matrix standard threads) on client devices (it does not display on mobile phones), retaining these commands is still essential.

mwnu avatar Apr 25 '24 05:04 mwnu

With 81543d5, mention bot

  • with image will trigger gpt vision
  • with plain text will trigger thread level chatting

Try image: hibobmaster/matrixchatgptbot:sha-81543d561b46df4158892324172b5145e44f0e32 image image

Some of the robot's responses are displayed entirely in red font, while others are not. It appears that the <mx-reply> <blockquote> tags have been added in the HTML. Is this another way of implementing the reply function in Matrix? Very strange!

mwnu avatar Apr 25 '24 06:04 mwnu

https://github.com/hibobmaster/matrix_chatgpt_bot/commit/c5834db9b270181a9987aff05c311d6c698a3d49 Try image: hibobmaster/matrixchatgptbot:sha-c5834db9b270181a9987aff05c311d6c698a3d49 Screenshot_2024-04-26-01-31-19-448_im vector app-edit

hibobmaster avatar Apr 25 '24 17:04 hibobmaster

sha-c5834db9b270181a9987aff05c311d6c698a3d49

I tested it, and these commands execute successfully within threads. They seem to be independent of the thread context. For example, after uploading a picture, the bot cannot describe it directly. Instead, it requires a separate reply to reference it, and the bot's response is not considered part of the thread context. Of course, the current method is okay, almost like a thread within a thread 😀. This can reduce unnecessary context, which can be referenced separately when needed.

mwnu avatar Apr 25 '24 17:04 mwnu

the bot cannot describe it directly

GPT Vision has a prompt which should be provided by user since matrix doesn't support sending a image with description.

and the bot's response is not considered part of the thread context

Try image: hibobmaster/matrixchatgptbot:v1.7.2

image

hibobmaster avatar Apr 26 '24 10:04 hibobmaster

the bot cannot describe it directly

GPT Vision has a prompt which should be provided by user since matrix doesn't support sending a image with description.

and the bot's response is not considered part of the thread context

Try image: hibobmaster/matrixchatgptbot:v1.7.2

image

done👍

mwnu avatar Apr 26 '24 12:04 mwnu