LLM-Tuning
Tuning LLMs with no tears💦, sharing LLM-tools with love❤️.
```
Traceback (most recent call last):
  File "/usr/local/anaconda3/envs/test39/lib/python3.9/site-packages/sklearn/__check_build/__init__.py", line 45, in <module>
    from ._check_build import check_build  # noqa
ImportError: dlopen: cannot load any more object with static TLS
During handling of the...
```
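A commonly reported workaround for this old-glibc `dlopen` limitation is to import the library that needs static TLS before anything else in the entry script; the snippet below is a sketch of that approach, not a fix confirmed by this repo.

```python
# Workaround sketch for "ImportError: dlopen: cannot load any more object
# with static TLS": import the affected library at the very top of the
# entry script, before torch/transformers, so its TLS slots are claimed
# while slots are still available.
import sklearn  # noqa: F401
```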
Thanks for your work. I want to ask why the epoch shown in the log differs from the progress bar. I used the command to run the LoRA tuning with 8 GPUs....
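One plausible explanation (not confirmed for this specific run) is data parallelism: each optimizer step consumes a batch per GPU, so the logged epoch advances faster per step than a single-GPU run would suggest. A rough sanity check with hypothetical numbers:

```python
# With 8-GPU data parallelism, one optimizer step consumes
# per_device_batch * gpus * grad_accum examples, so the "epoch" field in
# the log and the step-based progress bar advance at different rates.
dataset_size = 10_000   # hypothetical
per_device_batch = 4    # hypothetical
gpus = 8
grad_accum = 1          # hypothetical
steps_per_epoch = dataset_size // (per_device_batch * gpus * grad_accum)
print(steps_per_epoch)  # 312 optimizer steps correspond to one full epoch
```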
Following the process in the README, I built a small self-introduction dataset and went through tokenization and LoRA training step by step until the loss dropped to around 0.0001. I then loaded the fine-tuned weights with model = PeftModel.from_pretrained(model, "/home/llm/ChatGLM2-6B/finetuning/weights").half(), but the self-introduction is still unchanged (the original stock answer). Could someone explain what the right approach is, or share their fine-tuning process?
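For anyone debugging the same symptom, here is a minimal loading-and-verification sketch, assuming the adapter was saved to the path above; the chat call and the checks are illustrative, not this repo's confirmed procedure:

```python
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base = "THUDM/chatglm2-6b"
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModel.from_pretrained(base, trust_remote_code=True).half().cuda()
# Load the LoRA adapter on top of the frozen base model.
model = PeftModel.from_pretrained(model, "/home/llm/ChatGLM2-6B/finetuning/weights")
model.eval()

# If answers look unchanged, check that adapter_model.bin in the weights
# directory is non-trivially sized, and that the target_modules used in
# training actually match the base model's module names.
response, _ = model.chat(tokenizer, "请介绍一下你自己", history=[])
print(response)
```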
I looked at GPU utilization: the GPUs spike to 100% one at a time, in turn. Has anyone else seen this?
```
(tuning) [yons@Ubuntu 17:54:44] ~/work/tuning/LLM-Tuning $ python3 tokenize_dataset_rows.py \
    --model_checkpoint /home/yons/work/glm/ChatGLM2-6B/THUDM/chatglm2-6b \
    --input_file CMeiE-train.json \
    --prompt_key q --target_key a \
    --save_name simple_math_4op \
    --max_seq_length 2000 \
    --skip_overlength False
Downloading and preparing dataset generator/default to file:///home/yons/.cache/huggingface/datasets/generator/default-35c7964d6cacead3/0.0.0...
Traceback (most...
```
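One thing worth checking here, under the assumption that tokenize_dataset_rows.py declares its flags via argparse with type=bool: on the command line the string "False" is non-empty and therefore truthy, so --skip_overlength False may not do what it looks like.

```python
# argparse gotcha sketch: type=bool applies bool() to the raw string,
# and bool("False") is True because any non-empty string is truthy.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--skip_overlength", type=bool, default=False)
args = parser.parse_args(["--skip_overlength", "False"])
print(args.skip_overlength)  # True
```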
Can the Code Llama fine-tuning script be used with Baichuan2?
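Generally the script structure carries over, but the LoRA target modules differ between the two architectures; a sketch assuming a peft-based setup (module names taken from each model's released architecture, not from this repo's scripts):

```python
# Llama-family models expose separate Q/K/V projections, while Baichuan2
# fuses them into a single "W_pack" linear, so target_modules must change.
from peft import LoraConfig, TaskType

llama_lora = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["q_proj", "k_proj", "v_proj"],
)
baichuan_lora = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["W_pack"],
)
```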
```python
def data_collator(features: list) -> dict:
    len_ids = [len(feature["input_ids"]) for feature in features]
    longest = max(len_ids)
    input_ids = []
    labels_list = []
    for ids_l, feature in sorted(zip(len_ids, features), key=lambda x: -x[0]):
        ...
```
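A hedged completion of the truncated collator above, following the common ChatGLM LoRA recipe: sequences are padded to the longest item in the batch and prompt tokens are masked out of the labels with -100. The seq_len field and the outer tokenizer variable are assumptions from that recipe, not confirmed from this repo.

```python
import torch

def data_collator(features: list) -> dict:
    len_ids = [len(feature["input_ids"]) for feature in features]
    longest = max(len_ids)
    input_ids, labels_list = [], []
    # Sort longest-first so padding is computed against the batch maximum.
    for ids_l, feature in sorted(zip(len_ids, features), key=lambda x: -x[0]):
        ids = feature["input_ids"]
        seq_len = feature["seq_len"]  # assumed: length of the prompt part
        # Mask the prompt (and padding) with -100 so only the response
        # tokens contribute to the loss.
        labels = [-100] * (seq_len - 1) + ids[seq_len - 1:] + [-100] * (longest - ids_l)
        ids = ids + [tokenizer.pad_token_id] * (longest - ids_l)
        input_ids.append(torch.LongTensor(ids))
        labels_list.append(torch.LongTensor(labels))
    return {
        "input_ids": torch.stack(input_ids),
        "labels": torch.stack(labels_list),
    }
```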
'ChatGLMForConditionalGeneration' object has no attribute 'hf_device_map'
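One common trigger, assuming the model was loaded without a device map: accelerate only sets hf_device_map on a model when device_map is passed at load time, so any code path that reads the attribute fails otherwise. A sketch:

```python
# Loading with device_map="auto" makes accelerate populate
# model.hf_device_map; without it the attribute simply does not exist.
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b",
    trust_remote_code=True,
    device_map="auto",
)
print(model.hf_device_map)
```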
Thanks for your work! Why does the PPO model here have a value head attached? https://github.com/beyondguo/LLM-Tuning/blob/ed68123815bc0add9ad2d7ddc2a48dc584db2c94/RLHF/rl_training.py#L185C1-L185C11 This head seems to be randomly initialized?
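For reference, trl's AutoModelForCausalLMWithValueHead (which rl_training.py appears to build on) attaches a small linear head that PPO uses for per-token value estimates; it does start out randomly initialized and is trained jointly during PPO updates. A minimal sketch:

```python
# The wrapper adds v_head, a Linear(hidden_size, 1) on top of the LM.
# PPO needs a value estimate V(s) per token to compute advantages, which
# a plain causal LM does not provide, hence the extra head.
from trl import AutoModelForCausalLMWithValueHead

model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
print(model.v_head)  # randomly initialized at first; learned during PPO
```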
ChatGLM-6B LoRA fine-tuning raises "iteration over a 0-d tensor" once it reaches the specified eval_step. The failure is shown below. The code:

```python
def train_v2(model, train_data, val_data):
    writer = SummaryWriter()
    world_size = int(os.environ.get("WORLD_SIZE", 1))
    ddp = world_size != 1
    train_args = TrainingArguments(
        output_dir=args.output_path,
        do_train=True,
        per_device_train_batch_size=4,
        ...
```
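The error message itself is easy to reproduce; a minimal sketch of the usual cause (some eval or metrics code iterating over a scalar loss tensor) and the typical fix, independent of this repo's specifics:

```python
import torch

loss = torch.tensor(1.5)      # a 0-d (scalar) tensor
try:
    for x in loss:            # raises TypeError: iteration over a 0-d tensor
        pass
except TypeError as e:
    print(e)

for x in loss.unsqueeze(0):   # wrap to 1-d before iterating
    print(x)
```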