qwjaskzxl

Results 13 comments of qwjaskzxl

> Have you read the README in the example directory carefully 😂? As for whether the gap is large or not, that is hard for me to judge, since it is not something that can be measured objectively. We can only confirm that the method we provide gives comparable results. — I see; since I could not get demo-quality answers after several tries, I wanted to confirm: were those results produced with the parameters in [inference_hf.py](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/inference_hf.py)? (I also suspect my own weights may be the problem.)

> > > Have you read the README in the example directory carefully 😂? As for whether the gap is large or not, that is hard for me to judge, since it is not something that can be measured objectively. We can only confirm that the method we provide gives comparable results.
> >
> > I see; since I could not get demo-quality answers after several tries, I wanted to confirm: were those results produced with the parameters in [inference_hf.py](https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/inference_hf.py)? (I also suspect my own weights may be the problem.)
>
> Those results were produced with llama.cpp; the parameters are listed in the [wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/llama.cpp%E9%87%8F%E5%8C%96%E9%83%A8%E7%BD%B2):
>
> ```
> -c 2048 --temp 0.2 -n 256 --repeat_penalty 1.1
> ```
>
> ...
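For context on what those llama.cpp flags control, `--temp` scales the logits before softmax and `--repeat_penalty` applies a CTRL-style penalty to recently generated tokens. A minimal Python sketch of that sampling step (the vocabulary, logits, and `sample_token` helper here are made up for illustration, not code from either repo):

```python
import math
import random

def sample_token(logits, history, temp=0.2, repeat_penalty=1.1):
    """Pick a token id from `logits` (dict: token id -> raw logit).

    Tokens already in `history` get the CTRL-style penalty: positive
    logits are divided by the penalty, negative ones multiplied, which
    lowers their probability either way. Logits are then scaled by
    1/temp and a token is drawn from the resulting softmax.
    """
    adjusted = {}
    for tok, logit in logits.items():
        if tok in history:
            logit = logit / repeat_penalty if logit > 0 else logit * repeat_penalty
        adjusted[tok] = logit / temp
    # numerically stable softmax sampling
    m = max(adjusted.values())
    exps = {tok: math.exp(v - m) for tok, v in adjusted.items()}
    r = random.random() * sum(exps.values())
    for tok, e in exps.items():
        r -= e
        if r <= 0:
            return tok
    return tok  # fallback for floating-point rounding

# toy vocabulary of 4 token ids; token 2 was just generated
logits = {0: 1.0, 1: 2.0, 2: 5.0, 3: -1.0}
print(sample_token(logits, history=[2]))
```

With a temperature as low as 0.2 the softmax is sharply peaked, so sampling is close to greedy; the repeat penalty mainly matters when the token that would otherwise dominate has just been emitted.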

> I was able to train using DeepSpeed on 8 V100 GPUs. Here is the training script and DeepSpeed config file.
>
> torchrun --nproc_per_node=8 --master_port=9776 train.py --model_name_or_path hf_model/llama-7b --data_path...

> > I also posted the https://github.com/tatsu-lab/stanford_alpaca/files/11024692/Mar20_05-17-08_0c56f6779a08.csv CSV log file!
> >
> > It takes approx. 24 hours (a day)
>
> This is strange. You are using way better GPUs than mine. As...

> Facing the same issue: the loss is 0 for the 13B model and extremely large (more than 10,000 after 0.5 epochs) for the 7B model on the cleaned dataset....

> > maybe because of CPU OOM?
>
> Can you provide more details, please?

I watched my RAM reach 256 GB / 256 GB, and then the error occurred.
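One stdlib-only way to confirm that host memory (rather than GPU memory) is the culprit is to log the process's peak resident set size from inside the training script. A minimal sketch, assuming a Linux or macOS host; `peak_rss_gib` is a hypothetical helper, not part of either repo:

```python
import resource
import sys

def peak_rss_gib():
    """Return this process's peak resident set size in GiB.

    ru_maxrss is reported in KiB on Linux but in bytes on macOS,
    so the conversion differs by platform.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        return rss / (1024 ** 3)
    return rss / (1024 ** 2)

# e.g. print this once per training step: a steady climb toward the
# machine's full 256 GB just before the crash points at CPU OOM
# (often optimizer state or offloaded tensors, not GPU memory)
print(f"peak RSS: {peak_rss_gib():.2f} GiB")
```

If the peak plateaus well below physical RAM, the failure is more likely elsewhere (e.g. the kernel OOM killer targeting another process, or a GPU-side error).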