
How to know how each layer is used?

Open yukiarimo opened this issue 8 months ago • 2 comments

Hello! If I send raw text such as "Hi there" into a model, how can I find out how much each layer contributes, as a percentage, so I can identify the layer most responsible for that prompt? I would like a single script that does NOT use TransformerLens. Thanks.

yukiarimo avatar Apr 17 '25 16:04 yukiarimo

Mergekit doesn't offer direct per-layer analysis of prompts, but these might help:

  • positive_prompts from mergekit-moe is the closest match; it lets you build MoE models in which different experts handle specific types of prompts:
    https://github.com/arcee-ai/mergekit/blob/93b7693a6940afa7f45ef9f676098747b5883fa4/docs/moe.md?plain=1#L19-L20

  • Dense merges don't use prompts directly, but you can try mergekit-evolve (see docs/evolve.md) to test different merges and find the best parameters through evaluation:
    https://github.com/arcee-ai/mergekit/blob/93b7693a6940afa7f45ef9f676098747b5883fa4/docs/evolve.md?plain=1#L45-L46
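
For the original question (a per-layer percentage for one prompt, without TransformerLens), here is a rough sketch outside of mergekit. It assumes that "how much a layer is used" can be approximated by how far each layer moves the last token's hidden state. That is only a proxy, not a real attribution method. The helper names `layer_update_shares` and `hidden_states_for_prompt` are made up for this example. The second helper relies on Hugging Face transformers returning per-layer hidden states when `output_hidden_states=True` is passed.

```python
import math


def layer_update_shares(hidden_states):
    """Return each layer's share (in percent) of the total hidden-state change.

    hidden_states: list of vectors, embeddings first, then one per layer.
    Assumption: the L2 norm of (output - input) is a usable proxy for how
    much a layer "did" for this prompt.
    """
    deltas = []
    for prev, curr in zip(hidden_states, hidden_states[1:]):
        # Euclidean distance between consecutive hidden states
        deltas.append(math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr))))
    total = sum(deltas)
    if total == 0:
        return [0.0 for _ in deltas]
    return [100.0 * d / total for d in deltas]


def hidden_states_for_prompt(model_name, prompt):
    """Collect the last token's hidden state at every depth (hypothetical helper).

    Imports are local so the function above works without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    # outputs.hidden_states: one tensor per depth, shape [batch, seq, hidden]
    return [h[0, -1].float().tolist() for h in outputs.hidden_states]


if __name__ == "__main__":
    # Toy input: three "layers" acting on a 2-dimensional hidden state
    toy = [[0.0, 0.0], [3.0, 4.0], [3.0, 4.0], [6.0, 8.0]]
    for i, share in enumerate(layer_update_shares(toy)):
        print(f"layer {i}: {share:.1f}%")
```

With a real model you would call something like `layer_update_shares(hidden_states_for_prompt("<model name>", "Hi there"))`. Two caveats: in some architectures the last returned hidden state already has the final normalization applied, which can inflate the last layer's share, and a large update does not necessarily mean the layer mattered most for the output.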


@cg123 Can a method potentially use prompts to guide layer merging of dense models? For example, merge specialized math+coder models?

Katehuuh avatar May 05 '25 05:05 Katehuuh

Thanks! By the way, if I turn Gemma 3 12B into a MoE, it won't be supported anywhere (e.g. MLX or llama.cpp), right? So I wouldn't be able to run it :(

yukiarimo avatar May 05 '25 13:05 yukiarimo

I did fine-tuning (LoRA only; full fine-tuning wasn't possible, unfortunately).

yukiarimo avatar Jun 28 '25 05:06 yukiarimo