Chinese-LLaMA-Alpaca
alpaca-13b: no generated content after merging
Following the docs, the merged chinese-llama-13b generates output normally (CPU is just slow).
But alpaca-13b has a problem.
# base_model: converted from the original pt-format checkpoints
# lora_model: https://huggingface.co/ziqingyang/chinese-alpaca-lora-13b
!python ./merge_llama_with_chinese_lora_to_hf.py \
    --base_model './checkpoint/llama-13b-hf' \
    --lora_model './checkpoint/chinese-alpaca-lora-13b' \
    --output_dir './outputs/chinese-alpaca-13b'
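As a quick sanity check that the merge produced a usable folder (a minimal sketch, not from the project's docs; the exact expected vocabulary size depends on the release, but the extended Chinese vocabulary should be larger than LLaMA's original 32000 tokens):

from transformers import LlamaTokenizer
tok = LlamaTokenizer.from_pretrained('./outputs/chinese-alpaca-13b')
print(len(tok))  # should exceed 32000 for the extended Chinese vocabulary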
from transformers import GenerationConfig, LlamaForCausalLM, LlamaTokenizer
import torch

MODEL_PATH = './outputs/chinese-alpaca-13b'
device = 'cpu'

tokenizer = LlamaTokenizer.from_pretrained(MODEL_PATH)
model = LlamaForCausalLM.from_pretrained(
    MODEL_PATH,
    load_in_8bit=False,
    device_map={"": device},  # keep the whole model on the CPU
)
model.eval()  # disable dropout for inference
instruction = '中国的首都是哪里?并且做个300字介绍,包括一道特色菜和一个景点。'  # "What is the capital of China? Give a ~300-character introduction, including one signature dish and one scenic spot."
formatted_template = f'Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:\n'
inputs = tokenizer(formatted_template, return_tensors="pt")
input_ids = inputs["input_ids"].to(device)
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=GenerationConfig(
        temperature=0.1,     # ignored while do_sample is left at its False default
        top_p=0.75,
        top_k=40,
        num_beams=4,         # deterministic beam search
        max_new_tokens=128,
        # stream_output=False,  # not a GenerationConfig parameter; removed
    ),
    return_dict_in_generate=True,
    output_scores=True,
)
s = generation_output.sequences[0]
output = tokenizer.decode(s).replace(formatted_template, '')
print(output)
output:
<s> </s>
The whole flow runs without any errors; it just produces no result.
Could someone help review which step went wrong?
Without the replace on output, print(tokenizer.decode(s)) gives this in full:
<s> Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
中国的首都是哪里?并且做个300字介绍,包括一道特色菜和一个景点。
### Response:
</s>
Try generate with the parameters temperature=0.7, top_p=0.95, do_sample=True, num_beams=1, eos_token_id=tokenizer.eos_token_id?
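The key change is do_sample=True with num_beams=1. One likely explanation for the empty output above: with do_sample=False the temperature/top_p/top_k settings are ignored, and deterministic beam search can rank an immediate EOS as the best continuation, so nothing is generated; sampling avoids that. A minimal sketch of the suggested call, reusing the variables defined earlier (max_new_tokens carried over from the original call):

generation_output = model.generate(
    input_ids=input_ids,
    generation_config=GenerationConfig(
        do_sample=True,   # actually enables temperature/top_p sampling
        temperature=0.7,
        top_p=0.95,
        num_beams=1,
        max_new_tokens=128,
        eos_token_id=tokenizer.eos_token_id,
    ),
    return_dict_in_generate=True,
    output_scores=True,
)
print(tokenizer.decode(generation_output.sequences[0], skip_special_tokens=True))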
That did it. I'll go read the GenerationConfig docs.
@shuiiiiiimu Hi, could you share your transformers and tokenizer versions?
!pip install git+https://github.com/huggingface/transformers.git
!pip install git+https://github.com/huggingface/peft.git
!pip install sentencepiece
Installed yesterday, from the code on the latest branch.
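To report the exact versions those git installs produced, the standard __version__ attributes are enough (a minimal sketch):

import transformers, peft, sentencepiece
print(transformers.__version__, peft.__version__, sentencepiece.__version__)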
@shuiiiiiimu OK, thanks. I found that even with the generate parameters temperature=0.7, top_p=0.95, do_sample=True, num_beams=1, eos_token_id=tokenizer.eos_token_id, I still only get the empty <s> </s> output.
After I applied those settings I can get generations, but over repeated runs the output is blank more often than not (same parameters, run repeatedly: sometimes there is content, sometimes there isn't). No idea why.
@shuiiiiiimu OK, I'll dig into it on my end too.
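Worth noting for these comparisons: with do_sample=True, generation is stochastic by design, so identical parameters can give different outputs across runs; pinning the seed makes runs repeatable (a minimal sketch using the transformers helper):

from transformers import set_seed
set_seed(42)  # seeds python, numpy and torch RNGs so sampling repeats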
You can refer to the prompt template in scripts/inference_hf.py; in our tests it gives noticeably better results.
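For reference, a sketch of the shape of that template (the exact string lives in scripts/inference_hf.py and may differ; note the blank line after each section header, which is the main difference from the template used above):

PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n\n{instruction}\n\n### Response:\n\n"
)
formatted_template = PROMPT_TEMPLATE.format(instruction=instruction)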
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Has anyone solved this?
Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.