yanghu819
Hi, great work! Where can I find the checkpoint?
I tried running the 7B model on a single A100, with dependency versions matching the official ones. Command: CUDA_VISIBLE_DEVICES=0 /opt/anaconda3/envs/lavin/bin/torchrun --nproc_per_node 1 --master_port 11111 train.py \ --llm_model 7B \ --llama_model_path ../data/weights/ \ --data_path ../data/alpaca_data.json \ --max_seq_len 512 \ --batch_size 4 \ --accum_iter 8 \ --epochs 20...
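When comparing a single-GPU run against a multi-GPU reference, it may help to check the effective batch size implied by the flags in the command. The following is a minimal sketch, assuming `--batch_size` is per process and gradients are accumulated over `--accum_iter` steps before each optimizer update; that is an assumption about `train.py`, not something confirmed in this thread. The helper name `effective_batch_size` is hypothetical.

```python
def effective_batch_size(batch_size: int, accum_iter: int, nproc_per_node: int = 1) -> int:
    """Samples contributing to one optimizer step.

    Assumes batch_size is per process and accum_iter is the number of
    gradient-accumulation steps (an assumption about train.py).
    """
    return batch_size * accum_iter * nproc_per_node


# Flags from the command above: --batch_size 4 --accum_iter 8 --nproc_per_node 1
single_gpu = effective_batch_size(batch_size=4, accum_iter=8, nproc_per_node=1)
print(single_gpu)
```

If a reference run used more processes with the same per-process flags, its effective batch size would be larger by that factor, which could be one source of differing results.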
Could you share the exact single-GPU parameter configuration for an A100 40G?
The results on an A100 40G still fall short: [8737] {'acc_natural': '87.66', 'acc_social': '94.71', 'acc_language': '85.64', 'acc_has_text': '87.15', 'acc_has_image': '86.86', 'acc_no_context': '88.08', 'acc_grade_1_6': '89.79', 'acc_grade_7_12': '86.49', 'acc_average': '88.61'} Versions of torch and related packages: torch 1.13.0+cu117 transformers 4.37.0.dev0 bitsandbytes 0.41.3.post2 Full environment: name:...
Could the maintainers please provide official eval code? With my own attempts, the output either repeats itself or never stops.
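The two symptoms mentioned here (repeating output and generation that never stops) are commonly handled with an end-of-sequence check, a cap on new tokens, and a repetition guard. The sketch below is not the repository's eval code; it is a generic greedy loop under stated assumptions. `step_fn` is a hypothetical stand-in for "run the model and pick the next token id", and `eos_id` must be taken from the tokenizer actually in use.

```python
from typing import Callable, List, Sequence


def has_repeated_ngram(token_ids: Sequence[int], n: int = 4, max_repeats: int = 3) -> bool:
    """True if the last n tokens appear max_repeats times back to back at the tail."""
    if len(token_ids) < n * max_repeats:
        return False
    tail = list(token_ids[-n:])
    for k in range(1, max_repeats):
        start = len(token_ids) - (k + 1) * n
        if list(token_ids[start:start + n]) != tail:
            return False
    return True


def greedy_decode(
    step_fn: Callable[[List[int]], int],  # hypothetical: returns the next token id
    prompt_ids: Sequence[int],
    eos_id: int,                          # assumption: supplied by the tokenizer
    max_new_tokens: int = 64,
) -> List[int]:
    """Generate until EOS, the token cap, or a detected repetition loop."""
    ids = list(prompt_ids)
    prompt_len = len(ids)
    for _ in range(max_new_tokens):       # hard cap so decoding always terminates
        next_id = step_fn(ids)
        if next_id == eos_id:             # stop on end-of-sequence
            break
        ids.append(next_id)
        if has_repeated_ngram(ids[prompt_len:]):  # stop on a repetition loop
            break
    return ids[prompt_len:]


# Toy usage: a "model" that always emits token 7 is cut off by the repetition guard.
stuck = greedy_decode(lambda ids: 7, prompt_ids=[1, 5], eos_id=2)
print(len(stuck))
```

If generation never stops in practice, it is worth verifying that the training data actually appended the EOS token to each target, since a model that never saw EOS during fine-tuning has little reason to emit it.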
I found that acc: 0.05 was due to my incomplete training data. After using the correct gsm8k data, the result is a lot better, but there are still some issues. the...