eigenLiu

Results: 56 comments by eigenLiu

My goodness, why does everything have to be wrapped in another layer? Please provide a way to convert a fine-tuned model back to the base model format; the command below is hard to accept:
CUDA_VISIBLE_DEVICES=1 swift deploy --model_type qwen1half-7b-chat --model_cache_dir /data/ssd/LLM_models/qwen/Qwen1.5-7B-Chat --infer_backend vllm --use_flash_attn true --host 0.0.0.0 --port 8000 --max_new_tokens 512 --temperature 0.3 --top_p 0.7 --repetition_penalty 1.0

Is this what I need? For a LoRA fine-tuned model:
CUDA_VISIBLE_DEVICES=0 swift export \
    --ckpt_dir xxx/checkpoint-xxx --load_dataset_config true \
    --quant_method awq --quant_bits 4 \
    --merge_lora true
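For reference, merging LoRA weights back into the base model can also be done without the swift CLI, directly with PEFT. This is a minimal sketch, not the ms-swift implementation, and it assumes xxx/checkpoint-xxx is a LoRA adapter directory saved during fine-tuning (paths are the placeholders from the commands above):

```python
# Minimal sketch: fold a LoRA adapter back into the base model with PEFT.
# Paths are placeholders; adjust them to your checkpoint and base model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "/data/ssd/LLM_models/qwen/Qwen1.5-7B-Chat"   # base weights
)
model = PeftModel.from_pretrained(base, "xxx/checkpoint-xxx")  # LoRA adapter

merged = model.merge_and_unload()   # folds the LoRA deltas into the base weights
merged.save_pretrained("qwen1.5-7b-chat-merged")

tokenizer = AutoTokenizer.from_pretrained("/data/ssd/LLM_models/qwen/Qwen1.5-7B-Chat")
tokenizer.save_pretrained("qwen1.5-7b-chat-merged")
```

The merged directory is then a plain Hugging Face checkpoint that vLLM can serve directly, without the extra model_type wrapper.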

https://github.com/vllm-project/vllm/issues/3563

https://github.com/vllm-project/vllm/issues/627

https://github.com/bd-iaas-us/vllm/pull/1

https://github.com/bd-iaas-us/vllm/issues/3

It's not a good idea to use CPU memory, since vLLM is built for inference acceleration. There is a trade-off choice: whether we can cut some weights to fit...
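To illustrate the trade-off, here is a plain PyTorch sketch (not vLLM internals; layer size and batch shape are arbitrary assumptions): weights kept on the GPU only pay compute time, while weights parked in CPU memory pay a host-to-device copy on every forward pass.

```python
# Plain PyTorch sketch (not vLLM code) of the GPU-resident vs. CPU-offloaded
# trade-off. The 4096x4096 layer and batch shape are made-up assumptions.
import time
import torch

def timed_ms(fn):
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    fn()
    torch.cuda.synchronize()
    return (time.perf_counter() - t0) * 1e3

if torch.cuda.is_available():
    x = torch.randn(1, 4096, device="cuda")

    resident = torch.nn.Linear(4096, 4096).cuda()
    resident_ms = timed_ms(lambda: resident(x))   # compute only

    offloaded = torch.nn.Linear(4096, 4096)       # lives in CPU memory
    # Every use pays a host-to-device copy before the matmul can run.
    offload_ms = timed_ms(lambda: offloaded.to("cuda")(x))

    print(f"GPU-resident: {resident_ms:.2f} ms, CPU-offloaded: {offload_ms:.2f} ms")
else:
    print("CUDA not available; this sketch needs a GPU to show the gap.")
```

This is also why the cut-some-weights direction (e.g. quantization) tends to fit an inference-focused engine better: everything stays GPU-resident instead of paying transfer cost per step.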

@amyeroberts Hi, 1. Do I need to continuously merge in the latest modifications? 2. Do I need to fix all the CI errors?

Hi @amyeroberts, thanks for your guidance. I force-pushed the branch and fixed some workflow CI errors. See my "Files changed"; it's the minimal modification.

@amyeroberts @ArthurZucker All CI checks have now passed; please review my code and merge it. Thanks again to @amyeroberts for the instructions. To explain here once more: this PR will not affect any...