Is there any accuracy loss when converting to flm model?
I converted llama2-7b to an .flm model using fastllm_pytools.torch2flm. The inference result looks wrong, and it is inconsistent with the result from running llama2-7b directly:
prompt: The president of the United States is
generated result:
### Instruction:
The president of the United States is
### Response:
The president of the United States
The president of the United States is
### Instruction:
The president of the United States is
### Response:
The president of the United States is
### Instruction:
The president of the United States is
### Response:
The president of the United States is
### Instruction:
The president of the United States is
### Response:
The president of the United States is
The president of the United States is
### Instruction:
The president of the United States is
### Response:
The president of the United States is
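The conversion and inference were done roughly like this (a minimal sketch; the exact torch2flm.tofile and llm.model signatures may differ between fastllm versions):

```python
# Sketch of the conversion + inference path that produced the output above.
# Assumes fastllm's torch2flm.tofile(path, model, tokenizer, dtype=...) and
# fastllm_pytools.llm.model(...).response(...) APIs; adjust to your version.
from transformers import AutoTokenizer, AutoModelForCausalLM
from fastllm_pytools import torch2flm, llm

hf_path = "meta-llama/Llama-2-7b-hf"  # local checkpoint or HF id
tokenizer = AutoTokenizer.from_pretrained(hf_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(hf_path, trust_remote_code=True).float()

# Convert to .flm
torch2flm.tofile("llama2-7b.flm", model, tokenizer, dtype="float16")

# Run inference on the converted model
flm_model = llm.model("llama2-7b.flm")
print(flm_model.response("The president of the United States is"))
```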
You should pay attention to the difference in the prompt-building process between the original Python code and the fastllm code.
For llama2-7b:
- the Python code path is Llama.text_completion();
- in fastllm there is no directly callable equivalent, but you can wrap a method that calls fastllm_lib.launch_response_str_llm_model(), or you can configure the following fields in config.json and re-convert the model (see the sketch after the list):
"pre_prompt": "",
"user_role": "",
"bot_role": "",
"history_sep": "",
For llama2-7b-chat-hf:
- the Python code path is Llama.chat_completion();
- in fastllm, you should configure the following fields in config.json and re-convert the model (see the sketch after the list):
"pre_prompt": "",
"user_role": "[INST] ",
"bot_role": " [/INST]",
"history_sep": " ",