Chat needs the ability to attach images/files
Implementation
(Thinh's note)
Backend (2):
- New API route POST /files to upload files. Input: file. Logic: create new doc users > uid > files {id, name, thumbnail, mime_type, openai_file_id}; direct upload to OpenAI (/files). Output: {id, name, thumbnail}.
- POST /messages with new body param file_ids: [str]. Logic: chat with files using the OpenAI Threads API (https://platform.openai.com/docs/api-reference/threads); use the best OpenAI model, o1 (or gpt-4o if o1 doesn't work with files yet).
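A minimal sketch of the POST /files doc-creation logic described above. This is framework-agnostic and the names (`create_file_doc`, the dict standing in for Firestore) are assumptions for illustration, not the real omi backend:

```python
import uuid
from dataclasses import dataclass, asdict

@dataclass
class ChatFile:
    id: str
    name: str
    thumbnail: str
    mime_type: str
    openai_file_id: str

def create_file_doc(db: dict, uid: str, name: str, mime_type: str,
                    thumbnail: str, openai_file_id: str) -> dict:
    """Create users > uid > files > {doc} and return the public response shape."""
    doc = ChatFile(
        id=str(uuid.uuid4()),
        name=name,
        thumbnail=thumbnail,
        mime_type=mime_type,
        openai_file_id=openai_file_id,
    )
    # `db` stands in for Firestore here (users > uid > files collection)
    db.setdefault(uid, {}).setdefault("files", {})[doc.id] = asdict(doc)
    # Per the spec, openai_file_id stays server-side; the response only
    # exposes {id, name, thumbnail}
    return {"id": doc.id, "name": doc.name, "thumbnail": doc.thumbnail}
```

The point of the split output shape is that clients never handle the OpenAI file id directly; only the backend maps doc ids to `openai_file_id` when building the thread.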
App (4):
- Chat > Message box: add options to attach photos, take a photo, attach files.
- Chat > Upload photos/files to /files before submitting messages.
- Chat > Submit message to /messages with the new body field file_ids.
- Chat > Message list: render the message with attachments (photo, file).
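The app-side ordering above can be sketched as follows. The real client is the Flutter app; `upload` and `post_message` are hypothetical stand-ins for the POST /files and POST /messages calls named in the ticket:

```python
def send_message_with_files(upload, post_message, text, paths):
    # Upload every attachment first so we have server-side ids...
    file_ids = [upload(path)["id"] for path in paths]
    # ...then submit the message with the new `file_ids` body field.
    return post_message({"text": text, "file_ids": file_ids})
```

Uploading eagerly as soon as the user selects a file (as discussed later in the thread) only changes when `upload` runs; the message submit still just carries the collected ids.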
Note:
- The current chat feature: ensure the new chat works seamlessly with the existing chat feature.
- Keep the UI simple: we can use the OpenAI app as the standard product.
- Thread and end-thread option: the best implementation would detect whether the user's question needs context from a file (and which file) or not. If that's too complicated at this time, let's go with either:
  - having an option to end the thread
  - or just using Clear chat to force-end the thread
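A naive sketch of the "does this question need file context?" check described above. This is a keyword heuristic for illustration only; a real implementation would likely use an LLM classifier. Note that matching generic words such as "this" is what caused the #1866 misrouting ("What's this?"), so only explicit file references count here:

```python
# Hypothetical hint list; tune against real queries before shipping.
FILE_HINTS = ("file", "document", "attachment", "pdf", "image", "photo", "upload")

def needs_file_context(question: str, thread_open: bool) -> bool:
    """Route to the file thread only while one is open and the question
    explicitly references the attached material."""
    if not thread_open:
        return False
    q = question.lower()
    return any(hint in q for hint in FILE_HINTS)
```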
Maybe useful
- Chat with files example: https://github.com/BasedHardware/omi/issues/1617#issuecomment-2567425580
#1573
import os
import time

from dotenv import load_dotenv
import openai


class FileChat:
    def __init__(self):
        load_dotenv()
        openai.api_key = os.getenv("OPENAI_API_KEY")
        self.thread = None
        self.file_id = None
        self.assistant = None

    def load_document(self, file_path):
        """Upload a document to OpenAI and create a thread"""
        # Upload the file to OpenAI
        with open(file_path, 'rb') as file:
            response = openai.files.create(
                file=file,
                purpose='assistants'
            )
        self.file_id = response.id

        # Create an assistant with file search capability
        self.assistant = openai.beta.assistants.create(
            name="File Reader",
            instructions=(
                "You are a helpful assistant that answers questions about the "
                "provided file. Use the file_search tool to search the file "
                "contents when needed."
            ),
            model="gpt-4o",
            tools=[{"type": "file_search"}]
        )

        # Create a thread and attach the file
        self.thread = openai.beta.threads.create()
        openai.beta.threads.messages.create(
            thread_id=self.thread.id,
            role="user",
            content="Please help me answer questions about the attached file.",
            attachments=[{
                "file_id": self.file_id,
                "tools": [{"type": "file_search"}]
            }]
        )

    def ask(self, question):
        """Ask a question about the loaded document"""
        if not self.thread or not self.file_id:
            return "Please load a document first using load_document(file_path)"

        # Add the question to the thread
        openai.beta.threads.messages.create(
            thread_id=self.thread.id,
            role="user",
            content=question
        )

        # Create a run with the assistant
        run = openai.beta.threads.runs.create(
            thread_id=self.thread.id,
            assistant_id=self.assistant.id
        )

        # Poll until the run finishes; bail out on terminal failure states
        while True:
            run_status = openai.beta.threads.runs.retrieve(
                thread_id=self.thread.id,
                run_id=run.id
            )
            if run_status.status == 'completed':
                break
            if run_status.status in ('failed', 'cancelled', 'expired'):
                return f"Run ended with status: {run_status.status}"
            time.sleep(1)  # avoid hammering the API while polling

        # Get the messages (newest first)
        messages = openai.beta.threads.messages.list(
            thread_id=self.thread.id
        )
        # Return the latest assistant response
        return messages.data[0].content[0].text.value

    def cleanup(self):
        """Clean up resources"""
        if self.file_id:
            # Delete the file from OpenAI
            openai.files.delete(self.file_id)
            self.file_id = None
        if self.assistant:
            # Delete the assistant
            openai.beta.assistants.delete(self.assistant.id)
            self.assistant = None
        self.thread = None


def main():
    # Initialize the chat system
    chat = FileChat()
    print("Welcome to File Chat!")
    print("First, please provide the path to your text file.")
    try:
        while True:
            file_path = input("\nEnter file path (or 'quit' to exit): ")
            if file_path.lower() == 'quit':
                break
            try:
                chat.load_document(file_path)
                print(f"\nFile loaded successfully! You can now ask questions about {file_path}")
                while True:
                    question = input("\nAsk a question (or 'new' for new file, 'quit' to exit): ")
                    if question.lower() == 'quit':
                        chat.cleanup()
                        return
                    elif question.lower() == 'new':
                        chat.cleanup()
                        break
                    answer = chat.ask(question)
                    print("\nAnswer:", answer)
            except Exception as e:
                print(f"Error: {str(e)}")
    finally:
        # Ensure cleanup happens even if there's an error
        chat.cleanup()


if __name__ == "__main__":
    main()
@mdmohsin7 man, pls read this ticket's description and feel free to ask me anything. if everything is clear and you're excited about this feature, drop your UI/UX proposal then go ahead.
@nquang29 said that he is also excited about this ticket, so you can ask him whether he can help on the backend side or not / Quang's Discord @windtran_
Got it! As mentioned we can simply follow the UX of ChatGPT or even iMessage
Chat > Upload photos/files to /files before submitting messages
Chat > Submit message to /messages with the new body field file_ids
What if we upload the file right after the user selects it? Similar to how the ChatGPT app does it
So if I understand correctly, @nquang29 will be working on the backend and I'll have to make the app side changes?
@beastoin
yes i mean uploading right after selecting the image / file.
use our figma and draft the design pls sir
you can do both, or just ask Quang to see if he could help so that we can speed up the progress.
@mdmohsin7
The designs in our figma are very old and are not the ones that are being followed currently. I'll quickly code the design without the functionality and will share the image with you
Alright I'll message Quang on discord
@beastoin
Progress:
Are we going to allow multiple file uploads?
multiple file uploads - yes
when you use figma, your mind focuses completely on design (ui/ux) - not code. that's why, if you want to create great ux, you need to draft your ideas somewhere away from your code editor.
@mdmohsin7
What is the max limit on the number of files? And also any max limit on the file size?
Since we don't have the current UI designs in Figma, it would have taken more time to design the new UI so I just went with code itself for now. Pls check the video in #1629, that should give you an idea of how the UI will look. The app side part is almost done (will have to modify it a bit to support multiple files), just need to connect to the backend
@beastoin
just follow what chatgpt did
@mdmohsin7 ^
just follow what chatgpt did
ChatGPT only allows 3-4 files on the free plan
I've asked Quang on discord for help with the backend; he seems interested and I'm waiting for his response
@mdmohsin7 could you share the latest demo video here so that we can get feedback easier ?
@mdmohsin7 @nquang29 we should finish the feature in the next 3 days, so i will be making some slight pushes. be ready pls :)
@beastoin latest demo video (pls excuse my slow internet)
https://github.com/user-attachments/assets/8349752a-6825-402a-a322-37a0a74e4167
Deploy plan
- [x] Create new GCP bucket for chat thumbnails, public read: <x>_chat_files
- [x] Set new env var BUCKET_CHAT_FILES=<x>_chat_files on the backend
- [x] Create Firestore index on the messages collection; fields: chat_session_id Ascending, deleted Ascending, plugin_id Ascending, created_at Descending, __name__ / creation link
- [x] Deploy backend / https://github.com/BasedHardware/omi/actions/runs/13353001675
- [x] Deploy app https://github.com/BasedHardware/omi/releases/tag/v1.0.54%2B223-mobile-cm
product change logs
- the feature is ready on Testflight / Internal Test
please keep monitoring and improving the feature closely over the next 3 weeks
congratulations @mdmohsin7 @nquang29 @kodjima33 🚀
Doesn't work @mdmohsin7
@beastoin poor review
https://github.com/user-attachments/assets/eb929b8e-d8f0-4e4d-8be7-ff694d9c43b7
Didn't do the backend changes, will test once again if the frontend is missing something from its side
@nquang29 please push the fixes 🌚
product change logs
- the fixes are ready on testflight/internal test 🚀
@nquang29 pls check it / @mdmohsin7 @kodjima33 fyi ~
product change logs
- reverted ↘️
cause: the bad fixes in #1866:
It navigates users to the file chat anytime they ask: "What's this?"
@nquang29 pls fix it :)