
Give an explicit example of the instruction prompt structure in the readme

Open marco-ve opened this issue 2 years ago • 4 comments

Currently the readme is pointing newcomers to generation.py, where they have to deduce the correct prompt structure for the instruction model from this code:

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

[...]

for dialog in dialogs:
    unsafe_requests.append(
        any([tag in msg["content"] for tag in SPECIAL_TAGS for msg in dialog])
    )
    if dialog[0]["role"] == "system":
        dialog = [
            {
                "role": dialog[1]["role"],
                "content": B_SYS
                + dialog[0]["content"]
                + E_SYS
                + dialog[1]["content"],
            }
        ] + dialog[2:]
    assert all([msg["role"] == "user" for msg in dialog[::2]]) and all(
        [msg["role"] == "assistant" for msg in dialog[1::2]]
    ), (
        "model only supports 'system', 'user' and 'assistant' roles, "
        "starting with 'system', then 'user' and alternating (u/a/u/a/u...)"
    )
    dialog_tokens: List[int] = sum(
        [
            self.tokenizer.encode(
                f"{B_INST} {(prompt['content']).strip()} {E_INST} {(answer['content']).strip()} ",
                bos=True,
                eos=True,
            )
            for prompt, answer in zip(
                dialog[::2],
                dialog[1::2],
            )
        ],
        [],
    )
    assert (
        dialog[-1]["role"] == "user"
    ), f"Last message must be from user, got {dialog[-1]['role']}"
    dialog_tokens += self.tokenizer.encode(
        f"{B_INST} {(dialog[-1]['content']).strip()} {E_INST}",
        bos=True,
        eos=False,
    )
    prompt_tokens.append(dialog_tokens)

This seems unnecessarily obscure. Is there a specific reason to not just give an example?
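
For illustration, here is a minimal sketch of the text that the loop above appears to build for each dialog. `render_dialog` is a hypothetical helper, not part of the repository. It assumes the tokenizer's BOS and EOS tokens can be written as the literal strings `<s>` and `</s>`, and it skips the role-alternation and special-tag checks from the original code.

```python
from typing import Dict, List

B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
# Assumption: textual stand-ins for the BOS/EOS tokens that
# tokenizer.encode(..., bos=True, eos=True) adds as token IDs.
BOS, EOS = "<s>", "</s>"


def render_dialog(dialog: List[Dict[str, str]]) -> str:
    """Hypothetical helper: render a dialog as the text the quoted loop encodes."""
    # Fold an optional leading system message into the first user message.
    if dialog[0]["role"] == "system":
        dialog = [
            {
                "role": dialog[1]["role"],
                "content": B_SYS + dialog[0]["content"] + E_SYS + dialog[1]["content"],
            }
        ] + dialog[2:]
    if dialog[-1]["role"] != "user":
        raise ValueError(f"Last message must be from user, got {dialog[-1]['role']}")

    parts = []
    # Each completed user/assistant exchange is wrapped in BOS ... EOS.
    for prompt, answer in zip(dialog[::2], dialog[1::2]):
        parts.append(
            f"{BOS}{B_INST} {prompt['content'].strip()} {E_INST} "
            f"{answer['content'].strip()} {EOS}"
        )
    # The final user message gets BOS but no EOS, so the model continues from it.
    parts.append(f"{BOS}{B_INST} {dialog[-1]['content'].strip()} {E_INST}")
    return "".join(parts)


print(render_dialog([
    {"role": "system", "content": "Provide answers in Python"},
    {"role": "user", "content": "Write a function that reverses a string."},
]))
# Prints:
# <s>[INST] <<SYS>>
# Provide answers in Python
# <</SYS>>
#
# Write a function that reverses a string. [/INST]
```

Under the same assumptions, a multi-turn dialog with user "A", assistant "B", user "C" renders as `<s>[INST] A [/INST] B </s><s>[INST] C [/INST]`.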

marco-ve avatar Aug 25 '23 12:08 marco-ve

https://huggingface.co/blog/codellama#conversational-instructions

alphastrata avatar Aug 25 '23 22:08 alphastrata

I know the structure. I was pointing out that the readme is not linking to it for no apparent reason.

marco-ve avatar Aug 28 '23 07:08 marco-ve

I've noticed a growing trend lately of the mega corps only keeping their HF docs up to date. Perhaps it's by design?

alphastrata avatar Aug 28 '23 08:08 alphastrata

I second that. I understand that providing examples is a low priority for such a big project, but pointing to a clear, intuitive example or prompt template would be super helpful (e.g., I like how https://github.com/Stability-AI/StableLM/tree/main#quickstart does this).

zjysteven avatar Oct 14 '23 21:10 zjysteven