ChatGLM-Tuning issues

Error: Parameter at index 55 with name base_model.model.transformer.layers.27.attention.query_key_value.lora_B.default.weight has been marked as ready twice

5

I ran fine-tuning with this command: python -m torch.distributed.launch --use_env finetune.py \ --dataset_path data/answers \ --local_rank 32 \ --per_device_train_batch_size 6 \ --gradient_accumulation_steps 1 \ --max_steps 400 \ --save_steps 50 \ --save_total_limit 2 \ --learning_rate 1e-4...
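For context on the command above: when `torch.distributed.launch` is started with `--use_env`, it passes each process's local rank through the `LOCAL_RANK` environment variable instead of a `--local_rank` argument. The sketch below shows one way a script could read that value. The helper name `get_local_rank` is hypothetical, and this is not a diagnosis of the reported error.

```python
import os


def get_local_rank(env=None, default=0):
    """Return the local rank set by the launcher, or `default` if unset.

    `env` is injectable for testing; it falls back to os.environ.
    (Hypothetical helper, not part of finetune.py.)
    """
    env = os.environ if env is None else env
    return int(env.get("LOCAL_RANK", default))


# In a single-process run without a launcher this is typically 0.
local_rank = get_local_rank()
```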

19245222

How does this differ from the official ptune?

The official repo already has a ptune mechanism, and you have released a Tuning project as well. So how does yours differ from theirs?

19245222

How do I generate alpaca_data.json myself, and what is alpaca_data.jsonl used for?
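As a rough illustration of the question above: the Stanford Alpaca dataset is a JSON list of objects with `instruction`, `input`, and `output` fields, so a custom file in the same shape can be written with the standard library. This is a minimal sketch. The helper names and the sample records are made up for illustration, and it does not cover what this repository's conversion script expects beyond that basic shape.

```python
import json
import tempfile
from pathlib import Path


def make_record(instruction, output, input_text=""):
    """Build one Alpaca-style record (hypothetical helper)."""
    return {"instruction": instruction, "input": input_text, "output": output}


def write_alpaca_json(records, path):
    """Write records as a single JSON list, keeping non-ASCII text readable."""
    text = json.dumps(records, ensure_ascii=False, indent=2)
    Path(path).write_text(text, encoding="utf-8")


def demo():
    """Write two sample records to a temporary file and read them back."""
    records = [
        make_record("Translate the input into English.", "Hello", "你好"),
        make_record("Name one primary color.", "Red"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "alpaca_data.json"
        write_alpaca_json(records, path)
        return json.loads(path.read_text(encoding="utf-8"))


loaded = demo()
```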

2

dragononly

Exception raised: No module named 'transformers_modules.'

11

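Running this cell: `from transformers import AutoTokenizer` `tokenizer = AutoTokenizer.from_pretrained("../ChatGLM-6B/models/chatglm-6b", trust_remote_code=True)` raises the exception No module named 'transformers_modules.' When I switch transformers to 4.26.1, the following code fails instead: `model.enable_input_require_grads()` raises an error saying the attribute enable_input_require_grads does not exist, so I cannot fine-tune normally. I would appreciate some guidance.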

dayu1979

chore: two minor fixes

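Two minor fixes:
1. Removed the extra spaces from the `python tokenize_dataset_rows.py` arguments in the `README`; the extra spaces caused the escape character to escape the space.
2. In `cover_alpaca2jsonl.py`, `json.dumps` escapes Chinese characters, so the generated `jsonl` file is slightly less readable. `json.loads` unescapes them again, so functionality is unaffected.
```
>>> import json
>>> print(json.dumps({ "intro": "测试"}))
{"intro": "\u6d4b\u8bd5"}
>>> print(json.dumps({ "intro": "测试" }, ensure_ascii=False))
{"intro": "测试"}
>>> print(json.loads('{"intro": "\u6d4b\u8bd5"}'))
{'intro': '测试'}
>>>...
```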
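The `ensure_ascii=False` behavior described in this PR can be sketched for a whole JSONL file as follows. This is an illustrative sketch only: `to_jsonl` and `from_jsonl` are hypothetical helpers, not the code in `cover_alpaca2jsonl.py`.

```python
import json


def to_jsonl(records):
    """Serialize records one JSON object per line, leaving non-ASCII text unescaped."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def from_jsonl(text):
    """Parse JSONL text back into a list of objects, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


text = to_jsonl([{"intro": "测试"}])
roundtrip = from_jsonl(text)
```

Either setting round-trips through `json.loads`; the flag only affects how readable the file is on disk. When writing such text to a file, opening it with `encoding="utf-8"` avoids depending on the platform default encoding.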

WindRunnerMax

Error during inference

2

Does anyone know how to resolve this error during inference: ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatch the...

super-wuliao

no module named 'torch._six'

Is only the torch 1.x series supported?

darkwhale

GPU memory usage fluctuates

1

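Hello! This is very exciting work! With max_seq_length set to 200, I can train normally with this project on an A40 GPU with 45G of memory. However, with my own dataset and max_seq_length raised to 1024, training starts normally, but GPU memory usage keeps climbing until memory runs out and a CUDA out-of-memory error occurs. What causes the unstable memory usage during training?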

clannadcl