Enhancement: Adding embedding and fine-tuning for training
Contact Details
No response
What features would you like to see added?
Implementing embedding and fine-tuning for training.
- https://platform.openai.com/docs/guides/embeddings
- https://platform.openai.com/docs/guides/fine-tuning
This also means uploading files to OpenAI for training, configured from a backend setting. It could also include uploading files from the front end, with both drag-and-drop and a conventional file input for users.
More details
- https://platform.openai.com/docs/guides/embeddings
- https://platform.openai.com/docs/guides/fine-tuning
Which components are impacted by your request?
No response
Pictures
No response
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Thanks for the request. I agree; I think this would be a really welcome feature. I'll keep this in mind as I integrate file support (retrieval-augmented generation).
Maybe with a LangChain plugin, or without one. I think embedding support already exists there: https://js.langchain.com/docs/modules/data_connection/text_embedding/ but I didn't find anything for fine-tuning.
Maybe as text, files, and JSONL (JSON Lines). I don't know whether OpenAI only accepts text and JSONL. I thought about creating something to convert any file to text and any JSON to JSONL, but I'm not really sure.
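To make the "any JSON to JSONL" idea above concrete, here is a minimal sketch using only the Python standard library. The function names `to_jsonl` and `json_array_to_jsonl` are hypothetical, and the sketch assumes the input is a top-level JSON array; it does not check that the records match whatever schema the fine-tuning guide linked above expects.

```python
import json
from typing import Any, Iterable


def to_jsonl(records: Iterable[Any]) -> str:
    """Serialize each record as one JSON value per line (JSON Lines)."""
    # json.dumps escapes embedded newlines, so every record stays on one line.
    return "".join(json.dumps(record, ensure_ascii=False) + "\n" for record in records)


def json_array_to_jsonl(json_text: str) -> str:
    """Convert a JSON document holding a top-level array into JSONL text."""
    data = json.loads(json_text)
    if not isinstance(data, list):
        # Assumption: only arrays are converted; other shapes are rejected.
        raise ValueError("expected a top-level JSON array")
    return to_jsonl(data)


if __name__ == "__main__":
    sample = '[{"prompt": "Hi", "completion": "Hello!"}, {"prompt": "Bye", "completion": "See you."}]'
    print(json_array_to_jsonl(sample), end="")
```

Converting arbitrary files to text is a separate problem (it depends on the file type) and is not covered by this sketch.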
Could the backend use a pip package to prepare the embeddings? I would vote for a local embedding model to keep the documents private and reduce costs. It might be a reliable and consistent alternative to asking the current model for the conversation title.
INSTRUCTOR (Instruction-based Omnifarious Representations) 👨🏫: "Embeddings tailored to any task"
Paper: "One Embedder, Any Task: Instruction-Finetuned Text Embeddings"
Also, this relates to the "File support: vector indexing & retrieval" project item.
UPD: I just finished listening to the publication. Looking through the models, I found that the "-large" model had ~8x more downloads last month but is only 3x larger, at 1.34 GB (194,913).
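As a rough illustration of the local "vector indexing & retrieval" idea in this comment, the sketch below ranks documents by cosine similarity without sending anything to an external service. Everything here is an assumption for illustration: `toy_embed` is a hashed bag-of-words stand-in, not INSTRUCTOR or any real embedding model, and `top_k` is a hypothetical helper. A real local model would be passed in through the `embed` parameter.

```python
import hashlib
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Stand-in embedder (hashed bag of words). NOT a real embedding model."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        # Map each token to a bucket and count occurrences.
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vec


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: str,
    docs: Sequence[str],
    embed: Callable[[str], Vector] = toy_embed,
    k: int = 1,
) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(doc)), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    docs = [
        "embeddings map text to vectors",
        "fine-tuning uploads a jsonl training file",
        "drag and drop upload widget",
    ]
    for score, doc in top_k("how do embeddings map text", docs, k=2):
        print(f"{score:.3f}  {doc}")
```

Keeping the embedder behind a plain callable means the ranking code stays the same whether the vectors come from a local model or a hosted API.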