
Phi-3 mini stop token not recognized

Open DePasqualeOrg opened this issue 1 year ago • 4 comments

This Phi-3 model used in the LLMEval app doesn't behave as expected. It looks like the stop token is not being recognized.

Prompt:

Name a color.

Output:

Blue.<|end|><|assistant|> Blue is a primary color that is often associated with calmness, stability, and serenity. It is a color that can be found in nature, from the vast ocean to the clear sky.<|end|><|assistant|> Here is a color:

Purple.<|end|><|assistant|> Purple is a secondary color created by mixing equal parts of red and blue. It is often associated with creativity, wisdom, and luxury. Purple can evoke a sense of royalty and is sometimes linked to spirituality and mysticism.<|end|><|assistant|> Another color is:

Green.

Green is a primary color that symbolizes growth, harmony, and freshness. It is the color of nature, often associated with renewal and vitality. Green is also commonly linked to environmentalism and sustainability.<|end|>

Related issues:

  • https://github.com/ggerganov/llama.cpp/issues/6903
  • https://github.com/nomic-ai/gpt4all/issues/2271
  • https://github.com/huggingface/swift-transformers/issues/98

DePasqualeOrg avatar May 16 '24 08:05 DePasqualeOrg

Per the tokenizer config, the end of text is:

  "eos_token": "<|endoftext|>",

In the linked issues, people are working around this by manually modifying the tokenizer config or patching the stop check at runtime. Ideally the shipped tokenizer config would declare the correct eos token(s).

That said, the swift-transformers code only supports a single eos token.
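
For context, here is a minimal sketch of the kind of stop check involved, assuming a single configured EOS id. The names (generate, next, eosTokenId) are illustrative, not the actual swift-transformers or mlx-swift-examples API; the point is that with only one stop id, Phi-3's <|end|> is never compared against, so decoding runs on until <|endoftext|> or the token limit.

  // Illustrative sketch only -- not the actual swift-transformers or
  // mlx-swift-examples API. With a single EOS id, Phi-3's "<|end|>" token
  // is never checked, so generation continues until "<|endoftext|>".
  func generate(prompt: [Int], eosTokenId: Int, maxTokens: Int,
                next: ([Int]) -> Int) -> [Int] {
      var tokens = prompt
      for _ in 0..<maxTokens {
          let token = next(tokens)          // sample the next token id
          if token == eosTokenId { break }  // only one stop id is compared
          tokens.append(token)
      }
      return tokens
  }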

We have a couple options here:

  • change the eos token to <|end|> in the tokenizer config -- I don't know if this is actually correct
  • wait for https://github.com/huggingface/swift-transformers/issues/98 to (hopefully) support multiple eos tokens, then update the tokenizer config (I think this is probably ideal)
  • add a way to override/add additional eos tokens in code -- see the sketch after this list
    • https://github.com/ml-explore/mlx-swift-examples/blob/main/Libraries/LLM/Models.swift#L103
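
A rough sketch of what option 3 could look like, assuming a per-model configuration entry. The names PhiModelConfiguration and extraEOSTokens, and the model id, are hypothetical -- this is not the current Models.swift API:

  // Hypothetical sketch of option 3 -- not the current Models.swift API.
  struct PhiModelConfiguration {
      let id: String
      /// Additional stop strings beyond the tokenizer config's "eos_token".
      let extraEOSTokens: Set<String>
  }

  let phi3Mini = PhiModelConfiguration(
      id: "mlx-community/Phi-3-mini-4k-instruct-4bit",  // illustrative model id
      extraEOSTokens: ["<|end|>"]
  )

  // In the generation loop, stop on the configured EOS id or on any of the
  // extra EOS token ids resolved through the tokenizer.
  func shouldStop(_ token: Int, eosTokenId: Int, extraEOSTokenIds: Set<Int>) -> Bool {
      token == eosTokenId || extraEOSTokenIds.contains(token)
  }

This would keep the shipped tokenizer config untouched while still letting each model declare its own additional stop tokens.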

davidkoski avatar May 20 '24 15:05 davidkoski

In Python, we do a combination of 1 (change the config) and 3. The code-level override argument is nice because it adds a lot of flexibility. Regarding 2, it's not clear to me whether the original model authors intended the model to have two EOS tokens or whether it was just an oversight not to update the tokenizer config...

awni avatar May 20 '24 15:05 awni

OK, then let's plan on adding option 3 -- that will make it flexible at least.

davidkoski avatar May 20 '24 15:05 davidkoski

OK, I am working on the model configuration for #53, so it will be included with that.

davidkoski avatar May 20 '24 23:05 davidkoski

#76 should fix this

davidkoski avatar May 28 '24 23:05 davidkoski