chenhao

Results 19 comments of chenhao

How can multi-GPU deployment be run? It does indeed throw an error:

```
Traceback (most recent call last):
  File "web_demo.py", line 6, in <module>
    model = load_model_on_gpus("THUDM/chatglm-6b", num_gpus=4)
  File "/data/chenhao/codes/ChatGLM-6B/chatglm_parallel.py", line 34, in load_model_on_gpus
    model = load_checkpoint_and_dispatch(
  File "/data/chenhao/anaconda3/envs/ChatGLM-6B/lib/python3.8/site-packages/accelerate/big_modeling.py", line 479,...
```

> > Loading the quantized int4 model throws an error: ![image](https://user-images.githubusercontent.com/46914203/227116839-efcae0ad-430a-4ca4-8fd1-630734da8ce6.png)
>
> `model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True)`
> `model.save_pretrained("./multi_gpus", max_shard_size='2GB')`
>
> First run the two lines above with Python, then run the webui, filling in _**"./multi_gpus"**_ as the model path.

This does get it running, but a new problem has appeared: all 4 GPUs are indeed in use, yet the UI shows no response.
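For context, `load_model_on_gpus` in the ChatGLM-6B repo works by building an `accelerate`-style `device_map` that spreads the transformer layers across the available GPUs; sharding the checkpoint first with `save_pretrained(..., max_shard_size='2GB')` is what lets `load_checkpoint_and_dispatch` place it shard by shard. A minimal sketch of such a mapping (a hypothetical `make_device_map` helper with illustrative module names, not the repo's exact code):

```python
def make_device_map(num_layers: int, num_gpus: int) -> dict:
    """Evenly assign transformer layers to GPUs (hypothetical sketch).

    Module names are illustrative; real names depend on the model's
    state dict.
    """
    device_map = {
        "transformer.word_embeddings": 0,             # embeddings on the first GPU
        "transformer.final_layernorm": num_gpus - 1,  # tail modules on the last GPU
        "lm_head": num_gpus - 1,
    }
    per_gpu = -(-num_layers // num_gpus)  # ceiling division
    for layer in range(num_layers):
        device_map[f"transformer.layers.{layer}"] = layer // per_gpu
    return device_map

# ChatGLM-6B has 28 transformer layers; spread them over 4 GPUs:
print(make_device_map(28, 4))
```

A dict of this shape is what gets passed as `device_map=` to `accelerate`'s `load_checkpoint_and_dispatch`, so an error raised inside `big_modeling.py` usually points at a mismatch between the map's keys and the checkpoint's module names.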

> > I think that when the file_id filter is not added, the message with a score of 0.77 should also be returned > > It looks like it is...

```
{
  "result": {
    "status": "green",
    "optimizer_status": "ok",
    "vectors_count": 664740,
    "indexed_vectors_count": 651483,
    "points_count": 664740,
    "segments_count": 6,
    "config": {
      "params": {
        "vectors": {
          "size": 768,
          "distance": "Cosine"
        },
        "shard_number": 1,
        "replication_factor":...
```

Is it due to the mismatch between `points_count` and `indexed_vectors_count`?

> Could you check if the same issue appears when using `exact=true` while searching? That would eliminate the approximate nature of vector search, and should give back exact results. Using...
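To make that suggestion concrete: `exact=true` scores the query against every stored vector using the collection's distance (Cosine here), instead of walking the approximate HNSW index, so it is unaffected by vectors that are not yet indexed. A pure-Python sketch of that brute-force pass (illustrative only, not Qdrant's implementation):

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def exact_search(query, points, top_k=3):
    """Score every point against the query and return the top_k hits.

    `points` maps point id -> vector; this is the exhaustive pass that
    `exact=true` performs, with no index involved.
    """
    scored = [(pid, cosine(query, vec)) for pid, vec in points.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


points = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(exact_search([1.0, 0.0], points, top_k=2))
```

With `qdrant_client` this corresponds roughly to passing `search_params=models.SearchParams(exact=True)`. If the exact pass returns the 0.77-score point but the default search does not, the gap between `indexed_vectors_count` (651483) and `vectors_count` (664740) is a plausible cause, since those vectors are searched only approximately until indexing catches up.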