左冯翊 comments

Results: 2 comments of 左冯翊

[Question]: The chat module is slow in generating responses using the knowledge base.

> The main sources of high "time-to-first-token" latency in RAGFlow are typically the retrieval and query refinement stages, especially when using large embedding models like Qwen3-Embedding-8B and agent workflows. These...

[Question]: Deployment of RAGFlow GPU Version on Hygon Computing Card Cluster

> GPU has nothing to do with RAGFlow. You could deploy embedding inference service on GPU which accelerates indexing and searching procedure. @KevinHuSh I’ve just...