HJT9328

Results: 6 issues for HJT9328

Changing `torch_dtype` in config.json from float32 to float16 did not take effect, and setting `fp16: True` in train.py did not help either: inspecting the checkpoint afterwards showed no noticeable change in file size. Is there a good way to compress the model size?
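A likely explanation: `torch_dtype` in config.json only controls the dtype the weights are *loaded* in, while the tensors stored in the checkpoint file stay float32 until they are cast and re-saved. A minimal sketch of that conversion (the function name and paths here are illustrative, not part of swift's API):

```python
import torch

def cast_checkpoint_to_fp16(src, dst):
    """Cast all floating-point tensors in a PyTorch checkpoint to fp16.

    src/dst can be file paths or file-like objects accepted by
    torch.load/torch.save. Roughly halves the stored size, since
    float32 tensors dominate a checkpoint.
    """
    state = torch.load(src, map_location="cpu")
    state_fp16 = {
        # cast only floating-point tensors; leave int tensors (steps, ids) alone
        k: v.half() if torch.is_floating_point(v) else v
        for k, v in state.items()
    }
    torch.save(state_fp16, dst)
```

For larger savings than fp16, weight quantization (e.g. int8/int4) is the usual next step, but that changes the inference path rather than just the file on disk.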

question
wontfix

Running the command

RAY_memory_monitor_refresh_ms=0 CUDA_VISIBLE_DEVICES=2 swift infer \
  --model_type chatglm2-6b \
  --model_id_or_path /data/LLM_checkpoint/chatglm2-6b/chatglm2-6b \
  --infer_backend vllm --tensor_parallel_size 1

fails with an error. Note that `model_id_or_path` points to a fully fine-tuned (full-parameter) model, not a LoRA checkpoint.

[INFO:swift] Due to `ckpt_dir` being `None`, `load_args_from_ckpt_dir` is set to `False`. Traceback (most...

question

Deployed the qwen1.5-72B model; in testing, streaming first-token latency is around 1.6 s. Which parameters can reduce the first-token latency? Any advice appreciated.
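Whatever serving parameters end up being tuned, it helps to measure time-to-first-token consistently. A small, backend-agnostic sketch (the helper name is ours; `stream` stands in for whatever token iterator the inference client returns):

```python
import time

def time_to_first_token(stream):
    """Return (first_token, latency_seconds) for a streaming generator.

    Starts the clock when iteration begins and stops it as soon as the
    first token arrives, which is exactly the latency the user perceives.
    """
    start = time.perf_counter()
    first = next(iter(stream))
    return first, time.perf_counter() - start
```

Running this before and after each parameter change (prompt length, batch limits, tensor parallelism) isolates which knob actually moves the first-token number.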

### Required prerequisites - [X] I have read the documentation. - [X] I have searched the [Issue Tracker](https://github.com/baichuan-inc/baichuan-7B/issues) and [Discussions](https://github.com/baichuan-inc/baichuan-7B/discussions) to make sure this hasn't already been reported. (+1 or comment...

question