Open-Assistant
Open-Assistant copied to clipboard
Web Team Meeting: 2023-02-07
Agenda:
- Review what's changed
- TBD
- How to resolve wrong language submissions
- Live Inference Timeline?
- moderation / dealing with spam
- #1138
- #1126
- users potentially copy pasting stuff from chat gpt
- Different axes for labeling
- #1151
- #1107
- #1106
- #1105
- #1108
- proper signup page to avoid ToS problems
- migrate frontend users to python backend
- #1189
- #1190 for total prompts / trees / etc.
Notes
- Triaging Feature Requests
- Core team to prioritize highest priority goals (live inference, chat, etc)
- Triage requests and add
nice-to-havetag and promote to broader community
- Other Team Status
- Inference:
- Prototype exists
- HuggingFace has new infra for this.
- LangChain has some related support
- Keith needs to add JWT handling to auth protect server
- Need to double check if there's an easy to use Docker config to standup stack (Answer is probably no)
- Data:
- Andreas to spend more time on Data and Inference
- ML:
- Still some isolated work on RL and fractured models.
- Figuring out pipeline and coordinated approach to produce usable models
ralliohas some very viable models
- Inference:
- Live Chat UI Sketch
- V0: Can chat with model(s) directly, can rate or rank results. No history. Maybe can submit as new task for other evaluation.
- V1: Retain full chat histories that are stored and can be accessed in the future. Can reset or resume as desired.
- Let the backend side figure out how to handle storing and retaining user chat sessions.
- API Interaction:
- end point to start a chat -> Returns Chat ID. Can be re-used.
- end point to get results of a chat -> returns chat messages.
- end point to post a message to a chat -> Adds new message and returns response.
- missing: end point to get list of existing chats -> Should return list of chat IDs.
- Inference Scaling
- Federation Strategy: make it easy for people to run a model on their server and tie in their hosting to an inference engine we host.
- Donations: Would need a lot.
- Catalog of Issues
- Answer not saved after long time writing: message tree is closed most likely
- On Submit, wait for response and make no page changes on failure to allow for re-submit.
- Explain why answer couldn't be submitted (task finished, answer too long).
- Backend could store result but wouldn't be included in ranking phase.
- Report timeline to respond (5 minutes).
- Answer not saved after long time writing: message tree is closed most likely
- Moderation Activities
- Have a lot of trees that are halted due to reporting. Could be re-activated if they look decent.
- Need more UI features that make it easy to explore flagged messages and trees and make it easy to re-activate them.
- Need to implement troll board feature request for user moderation (to check on issue assignee)
- Messages View
- Should (could) it be separated by language?
- Is it showing in proper chronological order?
- feature idea: Add ability to translate messages into your locale.
- Language Issues
- Change flagging button to say "Not in {language}" based on the message information.
- Could update the free text tasks to report what language the response will be submitted in and maybe support changing that before submission.
- Different axes for labeling
- Supporting these require quite a bit of backend changes. How feasible is this?
- Theory: In general any taxonomy we pick will be imperfect. Right now we should at most simplify the set of tags we're gathering.
- For spelling issues: at a later date make it easy to browse data to fix small issues or create moderator role to allow a second tier of editing.
- Adding languages is easy, our only general concern is splitting the user base away.
- Deleting User Options
- Purge User: literally purges everything user is associated with and their messages. Highly destructive. Not well tested.
- Delete User: Remove user's ID and account information, de-associates user id from submitted messages.
- Will re-asses next week.