GXKIM
> The PT (pre-training) stage needs GB-scale data, while SFT only needs a few tens of thousands of examples. Right now you can't pass `--finetuning_type qlora` directly, can you? I only see `lora`.
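For what it's worth, in LLaMA-Factory QLoRA is typically enabled by keeping `--finetuning_type lora` and adding a quantization flag such as `--quantization_bit 4`, rather than a separate `qlora` type. Below is a minimal sketch of the same idea done directly with `peft` + `bitsandbytes`; the model id and hyperparameters are illustrative, not from this thread:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model with 4-bit NF4 quantization (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach trainable LoRA adapters on top of the frozen 4-bit base weights.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

The 4-bit base stays frozen and only the LoRA adapters train, which is what keeps the VRAM footprint small.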
> You mean 1.5 consumes more memory than 1? I did test the inference costs; see the doc here: https://qwen.readthedocs.io/en/latest/benchmark/hf_infer.html . Maybe I should add training costs for you...
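In the meantime, one rough way to measure peak inference VRAM for any checkpoint yourself; this is a sketch, and the model id and prompt are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-0.5B"  # placeholder; substitute the sizes you are comparing
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# Reset the counter, run one generation, then read back the high-water mark.
torch.cuda.reset_peak_memory_stats()
inputs = tok("Hello", return_tensors="pt").to("cuda")
model.generate(**inputs, max_new_tokens=128)
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 2**30:.2f} GiB")
```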
> Context length might be a factor. Are you using the official script, LLaMA-Factory, or Axolotl? The LLaMA-Factory framework uses normal VRAM for LoRA, but when using the...
> > > Context length might be a factor. Are you using the official script, LLaMA-Factory, or Axolotl? > > The LLaMA-Factory framework uses...
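To make the context-length point concrete: the raw attention-score tensor in one layer has shape (batch, heads, seq, seq), so its memory grows quadratically with sequence length. A back-of-the-envelope sketch, with head count and dtype chosen for illustration:

```python
def attention_scores_mib(batch: int, heads: int, seq_len: int, dtype_bytes: int = 2) -> float:
    """MiB taken by one layer's (batch, heads, seq, seq) attention-score matrix."""
    return batch * heads * seq_len * seq_len * dtype_bytes / 2**20

for seq in (512, 2048, 8192):
    print(f"seq={seq}: {attention_scores_mib(1, 32, seq):,.0f} MiB per layer")
```

Fused or flash-attention kernels avoid materializing this matrix, which is one reason frameworks can differ in VRAM use at the same context length.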
I ran into the same problem when running train_sft.py.
> I haven't tried loading pre-quantized int8 weights directly, but in theory, if it breaks, changing the loading code should be enough. Hi author, if I run finetune.py with the int8 parameter set, does the corresponding model also need to be the chatglm-6b-int8 one?
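For reference, ChatGLM-6B's remote code supports both routes, so which checkpoint finetune.py expects depends on that script. A loading sketch: both repo ids below are the public THUDM checkpoints, and the second route follows the pattern from the ChatGLM-6B README:

```python
from transformers import AutoModel

# Route 1: load the checkpoint that was already quantized to int8.
model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b-int8", trust_remote_code=True
).half().cuda()

# Route 2: load the full checkpoint and quantize it at load time instead.
# model = AutoModel.from_pretrained(
#     "THUDM/chatglm-6b", trust_remote_code=True
# ).quantize(8).half().cuda()
```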
> (glm-130b) ➜ GLM-130B git:(main) ✗ bash scripts/evaluate.sh tasks/bloom/glue_cola.yaml > WARNING:torch.distributed.run: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded,...
> > Question: torchrun can't be found. Is this a torch/CUDA version problem? Any advice? > Are you running on a V100 machine? V100 needs this installed: `pip install bminf` It's an A100.
> > Question: torchrun can't be found. Is this a torch/CUDA version problem? Any advice? > Are you running on a V100 machine? V100 needs this installed: `pip install bminf` The error appears when executing the script, and I can see the script does use torchrun; it's the scripts/generate.sh script.
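One thing worth ruling out before blaming the CUDA version: `torchrun` has shipped as a console script since PyTorch 1.10, and `python -m torch.distributed.run` is the same launcher, so if only the entry point is missing from PATH you can fall back to the module form. A quick check, sketched rather than taken from this thread:

```python
import shutil
import subprocess
import sys

# torchrun is the console-script alias of torch.distributed.run (PyTorch >= 1.10).
if shutil.which("torchrun") is None:
    # Entry point missing from PATH: invoke the module directly instead.
    subprocess.run([sys.executable, "-m", "torch.distributed.run", "--help"], check=True)
```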
> CPU RAM is 256 GB, GPUs are six RTX 3090s. > WARNING:torch.distributed.run: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the...
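That `OMP_NUM_THREADS` warning is informational: the launcher only defaults the variable to 1 when it is unset. Exporting a value before starting the script makes the choice explicit and silences it. A sketch, where the value 4 is illustrative rather than a tuned recommendation:

```python
import os
import subprocess

# Set OMP_NUM_THREADS in the child environment before the launcher starts,
# so torch.distributed.run does not fall back to its default of 1.
env = dict(os.environ, OMP_NUM_THREADS="4")
subprocess.run(
    ["bash", "scripts/evaluate.sh", "tasks/bloom/glue_cola.yaml"],
    env=env,
    check=True,
)
```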