qwjaskzxl

Results 13 comments of qwjaskzxl

> Have you read the README in the example directory carefully 😂? As for whether the gap is large or not, that is hard for me to judge, since it is not something that can be measured objectively. We can only confirm that the method we provide gives comparable results. — I see; since I could not get demo-quality answers after several tries, I wanted to confirm: were those results produced with the parameters in [inference_hf.py](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/inference_hf.py)? (I also suspect my own weights may be the problem.)

> > > Have you read the README in the example directory carefully 😂? As for whether the gap is large or not, that is hard for me to judge, since it is not something that can be measured objectively. We can only confirm that the method we provide gives comparable results.
> >
> > I see; since I could not get demo-quality answers after several tries, I wanted to confirm: were those results produced with the parameters in [inference_hf.py](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/inference_hf.py)? (I also suspect my own weights may be the problem.)
>
> Those results were produced with llama.cpp; the parameters are listed in the [wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/llama.cpp%E9%87%8F%E5%8C%96%E9%83%A8%E7%BD%B2):
>
> ```
> -c 2048 --temp 0.2 -n 256 --repeat_penalty 1.1
> ```
>
> ...
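For context on what those llama.cpp flags control, `--temp` scales the logits before softmax and `--repeat_penalty` applies a CTRL-style penalty to recently generated tokens. A minimal Python sketch of that sampling step (the vocabulary, logits, and `sample_token` helper here are made up for illustration, not code from either repo):

```python
import math
import random

def sample_token(logits, history, temp=0.2, repeat_penalty=1.1):
    """Pick a token id from `logits` (dict: token id -> raw logit).

    Tokens already in `history` get the CTRL-style penalty: positive
    logits are divided by the penalty, negative ones multiplied, which
    lowers their probability either way. Logits are then scaled by
    1/temp and a token is drawn from the resulting softmax.
    """
    adjusted = {}
    for tok, logit in logits.items():
        if tok in history:
            logit = logit / repeat_penalty if logit > 0 else logit * repeat_penalty
        adjusted[tok] = logit / temp
    # numerically stable softmax sampling
    m = max(adjusted.values())
    exps = {tok: math.exp(v - m) for tok, v in adjusted.items()}
    r = random.random() * sum(exps.values())
    for tok, e in exps.items():
        r -= e
        if r <= 0:
            return tok
    return tok  # fallback for floating-point rounding

# toy vocabulary of 4 token ids; token 2 was just generated
logits = {0: 1.0, 1: 2.0, 2: 5.0, 3: -1.0}
print(sample_token(logits, history=[2]))
```

With a temperature as low as 0.2 the softmax is sharply peaked, so sampling is close to greedy; the repeat penalty mainly matters when the token that would otherwise dominate has just been emitted.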

> I was able to train using DeepSpeed on 8 V100 GPUs. Here is the training script and DeepSpeed config file.
>
> torchrun --nproc_per_node=8 --master_port=9776 train.py --model_name_or_path hf_model/llama-7b --data_path...

> > I also posted the https://github.com/tatsu-lab/stanford_alpaca/files/11024692/Mar20_05-17-08_0c56f6779a08.csv CSV log file!
> >
> > It takes approx. 24 hours (a day)
>
> This is strange. You are using way better GPUs than mine. As...

> Facing the same issue: the loss is 0 for the 13B model and extremely large (more than 10,000 after 0.5 epochs) for the 7B model on the cleaned dataset....

> > maybe because of CPU OOM?
>
> Can you provide more details, please?

I watched my RAM reach 256 GB / 256 GB, and then the error occurred.
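One stdlib-only way to confirm that host memory (rather than GPU memory) is the culprit is to log the process's peak resident set size from inside the training script. A minimal sketch, assuming a Linux or macOS host; `peak_rss_gib` is a hypothetical helper, not part of either repo:

```python
import resource
import sys

def peak_rss_gib():
    """Return this process's peak resident set size in GiB.

    ru_maxrss is reported in KiB on Linux but in bytes on macOS,
    so the conversion differs by platform.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return rss / (1024 ** 3)
    return rss / (1024 ** 2)

# e.g. print this once per training step: a steady climb toward the
# machine's full 256 GB just before the crash points at CPU OOM
# (often optimizer state or offloaded tensors, not GPU memory)
print(f"peak RSS: {peak_rss_gib():.2f} GiB")
```

If the peak plateaus well below physical RAM, the failure is more likely elsewhere (e.g. the kernel OOM killer targeting another process, or a GPU-side error).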