chenhao
Can this be removed?
How do I run this with multi-GPU deployment? It does raise an error:
```
Traceback (most recent call last):
  File "web_demo.py", line 6, in <module>
    model = load_model_on_gpus("THUDM/chatglm-6b", num_gpus=4)
  File "/data/chenhao/codes/ChatGLM-6B/chatglm_parallel.py", line 34, in load_model_on_gpus
    model = load_checkpoint_and_dispatch(
  File "/data/chenhao/anaconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/accelerate/big_modeling.py", line 479,...
```
@ChuangLee Were you able to run it successfully?
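> > Loading the quantized int4 model raises an error:
> > `model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True)` `model.save_pretrained("./multi_gpus", max_shard_size='2GB')` Run the two lines above with Python first, then start the web UI with the model path set to _**"./multi_gpus"**_.

This does get it running, but now a new problem has appeared. There really are 4 GPUs.
Error message
Code
The web UI gives no response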
> > I think that when the file_id filter is not added, the message with a score of 0.77 should also be returned > > It looks like it is...
```
{
  "result": {
    "status": "green",
    "optimizer_status": "ok",
    "vectors_count": 664740,
    "indexed_vectors_count": 651483,
    "points_count": 664740,
    "segments_count": 6,
    "config": {
      "params": {
        "vectors": {
          "size": 768,
          "distance": "Cosine"
        },
        "shard_number": 1,
        "replication_factor":...
```
Is it due to the mismatch between `points_count` and `indexed_vectors_count`?
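To put a number on that mismatch, here is a minimal Python sketch that computes the gap from the counts in the collection info quoted above. The helper name `unindexed_gap` is made up for illustration. The sketch only does the arithmetic and does not show whether the unindexed vectors are the cause of the missing result.

```python
# Counts copied from the collection info posted in this thread.
collection_info = {
    "vectors_count": 664740,
    "indexed_vectors_count": 651483,
    "points_count": 664740,
}


def unindexed_gap(info):
    """Return (number of vectors not yet indexed, their share of all vectors)."""
    total = info["vectors_count"]
    indexed = info["indexed_vectors_count"]
    missing = total - indexed
    return missing, missing / total


missing, fraction = unindexed_gap(collection_info)
print(f"{missing} vectors not indexed ({fraction:.2%} of the collection)")
# prints: 13257 vectors not indexed (1.99% of the collection)
```

So roughly 2% of the vectors are outside the index at the time of this snapshot.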
This reproduces frequently.
> Could you check if the same issue appears when using `exact=true` while searching? That would eliminate the approximate nature of vector search, and should give back exact results. Using...
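For anyone who wants to try the suggestion, the following is a rough sketch of a search request body with exact search turned on, assuming the Qdrant REST search endpoint (`POST /collections/{collection_name}/points/search`). The function name `build_search_body`, the 3-element vector, and the `file_id` value `"doc-123"` are placeholders for illustration only. The real collection uses 768-dimensional vectors, and the actual payload schema may differ.

```python
import json


def build_search_body(query_vector, limit=10, file_id=None, exact=True):
    """Build a JSON body for a Qdrant search request.

    params.exact=True asks for a full scan instead of the approximate index.
    file_id, if given, adds a payload filter (the key name is assumed from
    this thread).
    """
    body = {
        "vector": list(query_vector),
        "limit": limit,
        "with_payload": True,
        "params": {"exact": exact},
    }
    if file_id is not None:
        body["filter"] = {
            "must": [{"key": "file_id", "match": {"value": file_id}}]
        }
    return body


# Placeholder query: a real query vector would have 768 components.
body = build_search_body([0.1, 0.2, 0.3], limit=5, file_id="doc-123")
print(json.dumps(body, indent=2))
```

Sending the same query once with `file_id=None` and once with the filter, both with `exact=True`, should show whether the 0.77 result goes missing only in approximate mode.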