QAnything 是否考虑支持 cpu 推理

是否考虑支持 cpu 推理

Open ihacku opened this issue 11 months ago • 5 comments

一些场景比如工单 bot 回复不需要实时回复，内存够的情况下，cpu 推理慢一点能出结果也可以

Mar 22 '24 13:03 ihacku

https://github.com/Ma-Dan/QAnything/blob/cpu/%E6%9C%AC%E5%9C%B0CPU%E9%83%A8%E7%BD%B2%E5%92%8C%E8%B0%83%E8%AF%95%E6%96%B9%E6%B3%95.txt 试试我这个方法，milvus和mysql还是docker运行，3个模型和前后端服务都在本地了

Mar 23 '24 02:03 Ma-Dan

You can run llamafile(GGUF) on your Windows PC or other OS ,It's support CPU infer and provide an OpenAI Compatibility API。

You should modify this file (QAnything/tree/master/qanything_kernel/connector/llm /llm_for_online.py) to your local llm server endpoint。

Mar 25 '24 02:03 dubeno

我试了一上午就切换回GPU了

Mar 26 '24 02:03 Ma-Dan

we are now support python only environment installation, here are documents: https://github.com/netease-youdao/QAnything?tab=readme-ov-file#installationpure-python-environment

python only installation support Mac and CPU only machine.

please try again!

Apr 08 '24 07:04 successren

QAnything QAnything copied to clipboard

是否考虑支持 cpu 推理

QAnything
QAnything copied to clipboard