Chinese-LLaMA-Alpaca
alpaca-13b: no generated content after merging
Following the docs, the merged chinese-llama-13b generates output normally (CPU is just slow).
But alpaca-13b has a problem.
# base_model: converted from the original pt-format checkpoints
# lora_model: https://huggingface.co/ziqingyang/chinese-alpaca-lora-13b
!python ./merge_llama_with_chinese_lora_to_hf.py \
    --base_model './checkpoint/llama-13b-hf' \
    --lora_model './checkpoint/chinese-alpaca-lora-13b' \
    --output_dir './outputs/chinese-alpaca-13b'
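As a quick sanity check that the merge produced a usable folder (a minimal sketch, not from the project's docs; the exact expected vocabulary size depends on the release, but the extended Chinese vocabulary should be larger than LLaMA's original 32000 tokens):

from transformers import LlamaTokenizer
tok = LlamaTokenizer.from_pretrained('./outputs/chinese-alpaca-13b')
print(len(tok))  # should exceed 32000 for the extended Chinese vocabulary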
from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer
import torch

MODEL_PATH = './outputs/chinese-alpaca-13b'
device = 'cpu'

tokenizer = LlamaTokenizer.from_pretrained(MODEL_PATH)
model = LlamaForCausalLM.from_pretrained(
    MODEL_PATH,
    load_in_8bit=False,
    device_map={"": device},  # keep the whole model on the CPU
)
model.eval()  # disable dropout for inference
instruction = '中国的首都是哪里?并且做个300字介绍,包括一道特色菜和一个景点。'  # "What is the capital of China? Give a ~300-character introduction, including one signature dish and one scenic spot."
formatted_template = f'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n'
inputs = tokenizer(formatted_template, return_tensors="pt")
input_ids = inputs["input_ids"].to(device)
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=GenerationConfig(
        temperature=0.1,     # ignored while do_sample is left at its False default
        top_p=0.75,
        top_k=40,
        num_beams=4,         # deterministic beam search
        max_new_tokens=128,
        # stream_output=False,  # not a GenerationConfig parameter; removed
    ),
    return_dict_in_generate=True,
    output_scores=True,
)
s = generation_output.sequences[0]
output = tokenizer.decode(s).replace(formatted_template, '')
print(output)
output:
<s> </s>
The whole flow runs without any errors; it just produces no result.
Could someone help review which step went wrong?
Without the replace on output, print(tokenizer.decode(s)) gives this in full:
<s> Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
中国的首都是哪里?并且做个300字介绍,包括一道特色菜和一个景点。
### Response:
</s>
Try generate with the parameters temperature=0.7, top_p=0.95, do_sample=True, num_beams=1, eos_token_id=tokenizer.eos_token_id?
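The key change is do_sample=True with num_beams=1. One likely explanation for the empty output above: with do_sample=False the temperature/top_p/top_k settings are ignored, and deterministic beam search can rank an immediate EOS as the best continuation, so nothing is generated; sampling avoids that. A minimal sketch of the suggested call, reusing the variables defined earlier (max_new_tokens carried over from the original call):

generation_output = model.generate(
    input_ids=input_ids,
    generation_config=GenerationConfig(
        do_sample=True,   # actually enables temperature/top_p sampling
        temperature=0.7,
        top_p=0.95,
        num_beams=1,
        max_new_tokens=128,
        eos_token_id=tokenizer.eos_token_id,
    ),
    return_dict_in_generate=True,
    output_scores=True,
)
print(tokenizer.decode(generation_output.sequences[0], skip_special_tokens=True))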
That did it. I'll go read the GenerationConfig docs.
@shuiiiiiimu Hi, could you share your transformers and tokenizer versions?
!pip install git+https://github.com/huggingface/transformers.git
!pip install git+https://github.com/huggingface/peft.git
!pip install sentencepiece
Installed yesterday, from the code on the latest branch.
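To report the exact versions those git installs produced, the standard __version__ attributes are enough (a minimal sketch):

import transformers, peft, sentencepiece
print(transformers.__version__, peft.__version__, sentencepiece.__version__)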
@shuiiiiiimu OK, thanks. I found that even with the generate parameters temperature=0.7, top_p=0.95, do_sample=True, num_beams=1, eos_token_id=tokenizer.eos_token_id, I still only get the empty <s> </s> output.
After I applied those settings I can get generations, but over repeated runs the output is blank more often than not (same parameters, run repeatedly: sometimes there is content, sometimes there isn't). No idea why.
@shuiiiiiimu OK, I'll dig into it on my end too.
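Worth noting for these comparisons: with do_sample=True, generation is stochastic by design, so identical parameters can give different outputs across runs; pinning the seed makes runs repeatable (a minimal sketch using the transformers helper):

from transformers import set_seed
set_seed(42)  # seeds python, numpy and torch RNGs so sampling repeats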
You can refer to the prompt template in scripts/inference_hf.py; in our tests it gives noticeably better results.
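For reference, a sketch of the shape of that template (the exact string lives in scripts/inference_hf.py and may differ; note the blank line after each section header, which is the main difference from the template used above):

PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n\n{instruction}\n\n### Response:\n\n"
)
formatted_template = PROMPT_TEMPLATE.format(instruction=instruction)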
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Has anyone solved this?
Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.