Self-cognition fine-tuning has no effect

I fine-tuned CodeQwen1.5-7B-Chat following the steps in the official documentation. The whole pipeline ran through without errors, but after loading the checkpoint the model shows no sign of being fine-tuned; even the self-cognition fine-tuning did not work.
```python
sft_args = SftArguments(
    model_type=ModelType.codeqwen1half_7b_chat,
    model_id_or_path='qwen/CodeQwen1.5-7B-Chat',
    # dataset=[DatasetName.coig_cqia_chinese_traditional],
    custom_train_dataset_path=['F:/AAA/train.json'],
    train_dataset_sample=1000,
    logging_steps=5,
    max_length=2048,
    learning_rate=5e-5,
    warmup_ratio=0.4,
    output_dir='output',
    lora_target_modules=['ALL'],
    self_cognition_sample=500,
    dataloader_num_workers=0,
    model_name=['小黄', 'Xiao Huang'],
    model_author=['魔搭', 'ModelScope'])
```
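One thing worth checking: with `warmup_ratio=0.4` and a run that only reaches checkpoint-35 (about 35 optimizer steps), roughly the first 14 steps run at a reduced learning rate, so very little of the run happens at the full 5e-5. A minimal sketch of the schedule, assuming linear warmup followed by linear decay (the usual HF Trainer default) and 35 total steps — both numbers are assumptions, not read from the actual logs:

```python
# Sketch: effective learning rate per step under linear warmup + linear
# decay. total_steps=35 (from checkpoint-35) is an assumption.
total_steps = 35
warmup_ratio = 0.4
base_lr = 5e-5
warmup_steps = int(total_steps * warmup_ratio)  # 14 warmup steps

def lr_at(step):
    if step < warmup_steps:
        # linear warmup from ~0 up to base_lr
        return base_lr * (step + 1) / warmup_steps
    # linear decay from base_lr down to 0 over the remaining steps
    return base_lr * (total_steps - step) / max(1, total_steps - warmup_steps)

for step in (0, 7, 14, 25, 34):
    print(step, f'{lr_at(step):.2e}')
```

With these numbers, 40% of an already-short run is spent warming up; lowering `warmup_ratio` (e.g. to 0.03-0.1) or training for more steps gives the model more time at the full learning rate.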
train.json contains only a single sample, duplicated 100 times; after training it still did not work.
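For reference, a minimal way to build such a file — the `{"query": ..., "response": ...}` JSON format is what swift's custom dataset loader commonly accepts, but verify it against your installed version's docs; the file contents here are illustrative:

```python
# Sketch: generate a minimal custom_train_dataset_path file. The sample
# text is illustrative; check the expected format in your swift version.
import json

samples = [
    {'query': '你是谁?', 'response': '我是小黄,由魔搭训练的AI助手。'},
]
# One sample duplicated 100 times gives 100 identical gradient signals;
# a handful of varied phrasings usually teaches identity far better.
with open('train.json', 'w', encoding='utf-8') as f:
    json.dump(samples * 100, f, ensure_ascii=False)
```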
```python
# Experimental environment: 3090
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import (
    get_model_tokenizer, get_template, inference, ModelType,
    get_default_template_type,
)
from swift.utils import seed_everything
from swift.tuners import Swift

seed_everything(42)

ckpt_dir = './output/codeqwen1half-7b-chat/v3-20240422-152539/checkpoint-35'
model_type = ModelType.codeqwen1half_7b_chat
template_type = get_default_template_type(model_type)

# Load the base model, then attach the LoRA checkpoint for inference.
model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})
model.generation_config.max_new_tokens = 128
model = Swift.from_pretrained(model, ckpt_dir, inference_mode=True)

template = get_template(template_type, tokenizer)
query = '你是qwen吗?'
response, history = inference(model, template, query)
print(f'response: {response}')
print(f'history: {history}')
```
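A quick sanity check on the checkpoint number: with 100 custom samples plus `self_cognition_sample=500`, and assuming a per-device batch size of 1 with gradient accumulation of 16 (assumed defaults — read the actual values from the sft run's logs), one epoch is about 37 optimizer steps, so checkpoint-35 sits near the end of a single epoch rather than being an early, barely-trained save:

```python
# Sketch: estimate optimizer steps per epoch. batch_size and grad_accum
# are assumed defaults, not values read from the actual run.
custom_samples = 100      # 1 sample duplicated 100x in train.json
self_cog_samples = 500    # self_cognition_sample=500
batch_size, grad_accum = 1, 16

steps_per_epoch = (custom_samples + self_cog_samples) // (batch_size * grad_accum)
print(steps_per_epoch)  # 37, close to the saved checkpoint-35
```

If that estimate holds, the run saw each sample only once at a mostly warmed-up learning rate, which is consistent with the adapter having little visible effect; more epochs or more varied self-cognition data would be the first things to try.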