Sheng Liu
> The model can be re-used within the machine. We typically recommend creating a Predictor per thread. So if you are re-loading the model multiple times, you can avoid it...
> @oreo-yum > > The `predictor` for PyTorch is pretty light-weighted. Only the model takes significant memory. You can use a global `Predictor` (assume you use PyTorch) per JVM if...
@skeretna what is the working langchain version? I updated my langchain package and got the same failure, but it worked before the update. What's sad is I couldn't change the...
@XReyRobert I guess https://github.com/hyperonym/basaran might be such a project. It works directly with huggingface models.
@jstzwj Thanks for the fix, happy to see it in next version.
That would be great! Vicuna is the only self-hosted model I've tried that could execute complex prompts within langchain with very few changes. If we had `OpenVicuna`, there would be...
> > > > What do you mean? OpenLlaMA is an open source model trained on the RedPajama dataset. You mean fine tune it using the same method Vicuna was...
Thanks to the openai API, it's quite easy to integrate with langchain. One small difference is that langchain can pass multiple strings as the 'stop' param, while in fschat it's a single...
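One way to bridge that gap client-side, assuming the backend only honours a single stop string: truncate the returned text at the earliest occurrence of any of langchain's stop strings. A minimal sketch (the function name is mine, not from either library):

```python
def truncate_at_stops(text, stops):
    """Cut `text` at the earliest occurrence of any stop string,
    emulating a multi-string `stop` parameter on top of an API
    that only supports a single one."""
    cut = len(text)
    for s in stops:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)
    return text[:cut]
```

This post-processes the completion, so the backend still generates up to its own (single) stop or token limit; it only guarantees the caller never sees text past any stop string.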
@paolorechia That's great! We could select embeddings based on https://huggingface.co/spaces/mteb/leaderboard. Since the OpenAI model text-embedding-ada-002 is also listed, we could find some open-source models with similar performance by ranking...
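The selection step could look something like this: take the leaderboard scores, use text-embedding-ada-002 as the reference, and keep models within some tolerance of it. The scores below are made-up placeholders, not real MTEB numbers, and the helper name is mine:

```python
def similar_models(leaderboard, reference="text-embedding-ada-002", delta=2.0):
    """leaderboard: dict mapping model name -> average benchmark score.
    Returns model names within `delta` points of the reference model,
    sorted best-first. Scores passed in are assumed, not real data."""
    ref = leaderboard[reference]
    return sorted(
        (name for name, score in leaderboard.items()
         if name != reference and abs(score - ref) <= delta),
        key=lambda n: -leaderboard[n],
    )
```

With the real leaderboard one would also filter on licence and model size before picking a replacement.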