eigenLiu

Results: 56 comments by eigenLiu

My goodness, why does everything have to be wrapped in another layer? Please provide a way to convert a fine-tuned model back to the base model format; the command below is hard to accept:
CUDA_VISIBLE_DEVICES=1 swift deploy --model_type qwen1half-7b-chat --model_cache_dir /data/ssd/LLM_models/qwen/Qwen1.5-7B-Chat --infer_backend vllm --use_flash_attn true --host 0.0.0.0 --port 8000 --max_new_tokens 512 --temperature 0.3 --top_p 0.7 --repetition_penalty 1.0

Is this what I need? For a LoRA fine-tuned model:
CUDA_VISIBLE_DEVICES=0 swift export \
    --ckpt_dir xxx/checkpoint-xxx --load_dataset_config true \
    --quant_method awq --quant_bits 4 \
    --merge_lora true
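For reference, merging LoRA weights back into the base model can also be done without the swift CLI, directly with PEFT. This is a minimal sketch, not the ms-swift implementation, and it assumes xxx/checkpoint-xxx is a LoRA adapter directory saved during fine-tuning (paths are the placeholders from the commands above):

```python
# Minimal sketch: fold a LoRA adapter back into the base model with PEFT.
# Paths are placeholders; adjust them to your checkpoint and base model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "/data/ssd/LLM_models/qwen/Qwen1.5-7B-Chat"   # base weights
)
model = PeftModel.from_pretrained(base, "xxx/checkpoint-xxx")  # LoRA adapter

merged = model.merge_and_unload()   # folds the LoRA deltas into the base weights
merged.save_pretrained("qwen1.5-7b-chat-merged")

tokenizer = AutoTokenizer.from_pretrained("/data/ssd/LLM_models/qwen/Qwen1.5-7B-Chat")
tokenizer.save_pretrained("qwen1.5-7b-chat-merged")
```

The merged directory is then a plain Hugging Face checkpoint that vLLM can serve directly, without the extra model_type wrapper.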

https://github.com/vllm-project/vllm/issues/3563

https://github.com/vllm-project/vllm/issues/627

https://github.com/bd-iaas-us/vllm/pull/1

https://github.com/bd-iaas-us/vllm/issues/3

It's not a good idea to use CPU memory, since vLLM is built for inference acceleration. There is a trade-off choice: whether we can cut some weights to fit...
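To illustrate the trade-off, here is a plain PyTorch sketch (not vLLM internals; layer size and batch shape are arbitrary assumptions): weights kept on the GPU only pay compute time, while weights parked in CPU memory pay a host-to-device copy on every forward pass.

```python
# Plain PyTorch sketch (not vLLM code) of the GPU-resident vs. CPU-offloaded
# trade-off. The 4096x4096 layer and batch shape are made-up assumptions.
import time
import torch

def timed_ms(fn):
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    fn()
    torch.cuda.synchronize()
    return (time.perf_counter() - t0) * 1e3

if torch.cuda.is_available():
    x = torch.randn(1, 4096, device="cuda")

    resident = torch.nn.Linear(4096, 4096).cuda()
    resident_ms = timed_ms(lambda: resident(x))   # compute only

    offloaded = torch.nn.Linear(4096, 4096)       # lives in CPU memory
    # Every use pays a host-to-device copy before the matmul can run.
    offload_ms = timed_ms(lambda: offloaded.to("cuda")(x))

    print(f"GPU-resident: {resident_ms:.2f} ms, CPU-offloaded: {offload_ms:.2f} ms")
else:
    print("CUDA not available; this sketch needs a GPU to show the gap.")
```

This is also why the cut-some-weights direction (e.g. quantization) tends to fit an inference-focused engine better: everything stays GPU-resident instead of paying transfer cost per step.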

@amyeroberts Hi, 1. Do I need to continuously merge in the latest modifications? 2. Do I need to fix all the CI errors?

Hi @amyeroberts, thanks for your guidance. I force-pushed the branch and fixed some workflow CI errors. See my "Files changed"; it's the minimal modification.

@amyeroberts @ArthurZucker All CI checks have now passed; please review my code and merge it. Thanks again to @amyeroberts for the instructions. To explain here once more: this PR will not affect any...