Is there any accuracy loss when converting to flm model?
I converted llama2-7b to an .flm model using fastllm_pytools.torch2flm. The inference result looks wrong, and it is inconsistent with the result from running llama2-7b directly:
prompt: The president of the United States is
generated result:
### Instruction:
The president of the United States is
### Response:
The president of the United States
The president of the United States is
### Instruction:
The president of the United States is
### Response:
The president of the United States is
### Instruction:
The president of the United States is
### Response:
The president of the United States is
### Instruction:
The president of the United States is
### Response:
The president of the United States is
The president of the United States is
### Instruction:
The president of the United States is
### Response:
The president of the United States is
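The conversion and inference were done roughly like this (a minimal sketch; the exact torch2flm.tofile and llm.model signatures may differ between fastllm versions):

```python
# Sketch of the conversion + inference path that produced the output above.
# Assumes fastllm's torch2flm.tofile(path, model, tokenizer, dtype=...) and
# fastllm_pytools.llm.model(...).response(...) APIs; adjust to your version.
from transformers import AutoTokenizer, AutoModelForCausalLM
from fastllm_pytools import torch2flm, llm

hf_path = "meta-llama/Llama-2-7b-hf"  # local checkpoint or HF id
tokenizer = AutoTokenizer.from_pretrained(hf_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(hf_path, trust_remote_code=True).float()

# Convert to .flm
torch2flm.tofile("llama2-7b.flm", model, tokenizer, dtype="float16")

# Run inference on the converted model
flm_model = llm.model("llama2-7b.flm")
print(flm_model.response("The president of the United States is"))
```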
You should pay attention to the difference in the prompt-building process between the original Python code and the fastllm code.
For llama2-7b:
- the Python code path is Llama.text_completion();
- in fastllm there is no directly callable equivalent, but you can wrap a method that calls fastllm_lib.launch_response_str_llm_model(), or you can configure the following fields in config.json and re-convert the model (see the sketch after the list):
"pre_prompt": "",
"user_role": "",
"bot_role": "",
"history_sep": "",
For llama2-7b-chat-hf:
- the Python code path is Llama.chat_completion();
- in fastllm, you should configure the following fields in config.json and re-convert the model (see the sketch after the list):
"pre_prompt": "",
"user_role": "[INST] ",
"bot_role": " [/INST]",
"history_sep": " ",