wanghaiming

6 issues by wanghaiming

OS: CentOS 7.9 GPU: A100 40GB CUDA: 11.2 Python 3.10.12 Command: `torchrun --nproc_per_node=1 train_qlora.py --train_args_file train_args/qlora/baichuan-7b-sft-qlora.json` Base model: https://huggingface.co/baichuan-inc/Baichuan-7B Dataset: https://huggingface.co/datasets/YeungNLP/moss-003-sft-data Training arguments: ``` { "output_dir": "output/pdmi-baichuan-7b", "model_name_or_path": "../Baichuan-7B", "train_file": "./data/moss-003-sft-data.jsonl", "num_train_epochs": 1, "per_device_train_batch_size": 8, "gradient_accumulation_steps":...

### Self Checks - [X] This is only for bug report, if you would like to ask a question, please head to [Discussions](https://github.com/langgenius/dify/discussions/categories/general). - [X] I have searched for existing...

🐞 bug
🤔 cant-reproduce

Training fails with the following error: ``` (venv) [xinjingjing@dev-gpu-node-09 InstructGLM]$ python train_lora.py \ > --dataset_path data/belle \ > --lora_rank 8 \ > --per_device_train_batch_size 2 \ > --gradient_accumulation_steps 1 \ > --max_steps 52000 \ > --save_steps...

``` python cover_belle2jsonl.py \ --data_path data/Belle_open_source_1M.json \ --save_path data/belle_data.jsonl ``` 执行以上报如下错误: ``` Resolving data files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00