
There are still some problems with MoE-merging Qwen with other LLMs (like Llama, DeepSeek, etc.)

Open aoyinke opened this issue 7 months ago • 3 comments

Here is a piece of code from the file mergekit/mergekit/moe/qwen.py:

```python
for model_ref in (
    [config.base_model]
    + [e.source_model for e in config.experts]
    + [e.source_model for e in (config.shared_experts or [])]
):
    model_cfg = model_ref.config(trust_remote_code=trust_remote_code)
    model_types.append(model_cfg.model_type)

if len(set(model_types)) != 1:
    if explain:
        logging.warning(
            "Qwen MoE requires all input models to have the same architecture"
        )
    return False
if model_types[0] not in ("llama", "mistral", "qwen2"):
    print("model_types[0]", model_types[0])
    if explain:
        logging.warning(
            "Qwen MoE requires all input models to be Qwen2, Llama or Mistral models"
        )
    return True
```

The question is: how can I merge Qwen2 with other LLMs when `len(set(model_types))` has to equal 1?

When I change `len(set(model_types)) != 1:` to `len(set(model_types)) != 2:`, I can finally merge Qwen2 with Llama.
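
To make the effect of that edit concrete, here is a minimal standalone sketch of the guard. `same_architecture` and its `required_distinct` parameter are hypothetical names used only for illustration and are not part of mergekit; the lists stand in for the `model_type` strings collected from the base model, the experts, and the shared expert.

```python
def same_architecture(model_types, required_distinct=1):
    """Illustrative stand-in for the `len(set(model_types)) != 1` guard.

    `required_distinct` is a made-up knob that mimics editing the literal
    1 in the guard; it does not exist in mergekit.
    """
    return len(set(model_types)) == required_distinct


# Base model + two experts + one shared expert, as in the config in this issue.
mixed = ["qwen2", "llama", "llama", "llama"]
# A homogeneous setup, for comparison.
uniform = ["qwen2", "qwen2", "qwen2", "qwen2"]

print(same_architecture(mixed))                          # False
print(same_architecture(mixed, required_distinct=2))     # True
print(same_architecture(uniform, required_distinct=2))   # False
```

Under these assumptions, changing the constant to 2 lets a Qwen2 + Llama mix through but would reject a merge where every model has the same architecture. The sketch only covers the guard itself and says nothing about whether the merged weights are usable.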

Here is my config.yaml:

    base_model: */models/Qwen2-7B
    architecture: qwen
    experts:
      - source_model: */models/CodeLlama-7b-hf
        positive_prompts:
          - "code"
      - source_model: */models/CodeLlama-7b-hf
        positive_prompts:
          - "python"
    shared_experts:
      - source_model: /*/models/CodeLlama-7b-hf
        positive_prompts:
          - "programming"
          - "algorithm"

The documentation on how to merge Qwen2 is too sparse to be usable. Here are some notes on the requirements I ran into:

  1. Qwen MoE merge requires exactly one shared expert
  2. Qwen MoE requires the shared expert to have prompts
  3. Qwen MoE requires all input models to have the same architecture
  4. Qwen MoE requires all input models to be Qwen2, Llama or Mistral models
  5. The prompts of each expert cannot be the same.
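
As a rough way to check a config against the five notes above before running a merge, something like the following could be used. This is only a sketch: `check_qwen_moe_config`, the plain-dict config, the `model_types` lookup table, and the placeholder paths are all assumptions made for illustration and are not mergekit's API. Note 5 is read here as "no two experts have an identical prompt list".

```python
ALLOWED_TYPES = {"llama", "mistral", "qwen2"}


def check_qwen_moe_config(config, model_types):
    """Return the list of violated notes (empty list means none were found).

    `config` is a plain dict shaped like config.yaml; `model_types` maps each
    model path to its architecture string. Both are illustrative assumptions.
    """
    problems = []
    shared = config.get("shared_experts") or []
    experts = config.get("experts") or []

    # Notes 1 and 2: exactly one shared expert, and it must have prompts.
    if len(shared) != 1:
        problems.append("exactly one shared expert is required")
    if any(not e.get("positive_prompts") for e in shared):
        problems.append("the shared expert must have prompts")

    # Notes 3 and 4: one architecture across all models, from the allowed set.
    paths = [config["base_model"]] + [e["source_model"] for e in experts + shared]
    types = {model_types[p] for p in paths}
    if len(types) != 1:
        problems.append("all input models must share one architecture")
    if not types <= ALLOWED_TYPES:
        problems.append("models must be Qwen2, Llama or Mistral")

    # Note 5: no two experts may have the same prompt list.
    prompt_lists = [tuple(e.get("positive_prompts") or []) for e in experts]
    if len(set(prompt_lists)) != len(prompt_lists):
        problems.append("each expert needs distinct prompts")
    return problems


# Placeholder paths; the real ones in this issue are redacted.
config = {
    "base_model": "placeholder/Qwen2-7B",
    "experts": [
        {"source_model": "placeholder/CodeLlama-7b-hf", "positive_prompts": ["code"]},
        {"source_model": "placeholder/CodeLlama-7b-hf", "positive_prompts": ["python"]},
    ],
    "shared_experts": [
        {"source_model": "placeholder/CodeLlama-7b-hf",
         "positive_prompts": ["programming", "algorithm"]},
    ],
}
model_types = {
    "placeholder/Qwen2-7B": "qwen2",
    "placeholder/CodeLlama-7b-hf": "llama",
}

problems = check_qwen_moe_config(config, model_types)
print(problems)
```

For a config shaped like the one in this issue, the only note this sketch flags is the mixed-architecture one, which matches the warning that blocks the merge.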

aoyinke avatar Jul 03 '24 02:07 aoyinke