Hao Zhang
@qZhang88: Yes, your suggestion is reasonable, and you are welcome to do so. Contributions are welcome.
I'll take a look and try this PR later this week.
@sfc-gh-aqiao please watch this thread -- it affects chunked-prefill performance.
Yes, @sgsdxzy is right. Please re-open the issue if you still see it.
Closing, as this is not related to the development of this repo. Please try to find the appropriate hyperparameters for your own dataset; some HPO is needed!
@lan2720 We currently do not support APIs, because that would put too much stress on our server.
It is not very difficult to allow the model to output embeddings. Maybe improve [this part of the code](https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/model_worker.py#L151) and expose a FastAPI endpoint that returns the embeddings? Contributions are welcome.
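As a rough sketch of the embeddings path (not the actual FastChat worker code; the function name and shapes here are illustrative), the worker could mean-pool the model's last hidden states over non-padding tokens to get one vector per input. NumPy stands in for torch tensors:

```python
import numpy as np

def mean_pool_embeddings(hidden_states, attention_mask):
    """Mean-pool last-layer hidden states over non-padding tokens.

    hidden_states:  (batch, seq_len, dim) array of hidden states
    attention_mask: (batch, seq_len) array of 0/1 token validity flags
    returns:        (batch, dim) sentence embeddings
    """
    mask = attention_mask[:, :, None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)   # sum over real tokens only
    counts = mask.sum(axis=1)                     # number of real tokens
    return summed / np.maximum(counts, 1)         # avoid division by zero

# Example: batch of 2 sequences, seq_len 3, hidden dim 2
h = np.array([[[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]],
              [[2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]])
m = np.array([[1, 1, 0],   # last token of the first sequence is padding
              [1, 1, 1]])
emb = mean_pool_embeddings(h, m)
print(emb)  # [[2. 3.] [4. 4.]]
```

A FastAPI endpoint would then just wrap a call like this around the model's forward pass and return the vectors as JSON.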
Contributions are welcome. Feel free to submit a PR and ping me for review
Supported in #663. Closing. Please try it and let us know your feedback. A comparison report between the Vicuna embeddings and Sentence-Transformers embeddings would be appreciated.
@rjiang-ptm Since we have agreed (over email) to proceed with Option 3, maybe we can start the implementation (with the MBZUAI team)?