Recommended ROME layers and v_loss_layer for Qwen3 series (0.6B, 1.7B, 4B, 8B)
Hi EasyEdit Team,
First, thank you for this fantastic library!
I am currently working on adding support for the new Qwen3 model series (0.6B, 1.7B, 4B, and 8B) to use with the ROME algorithm. I've been creating the necessary .yaml config files, but I'm unsure about the correct hyperparameters to set for layers and v_loss_layer.
I understand from the ROME paper that the optimal layers parameter is found using Causal Tracing. To avoid having to run this analysis myself, I was hoping you might have these "official" values from your own testing, similar to how Qwen2.5-7B-Instruct is set to layers: [5].
Here is a summary of the model architectures as I understand them. Could you please help me fill in or confirm the recommended values for the layers parameter?
| Model | Total Layers | v_loss_layer (My Guess) | Recommended layers |
|---|---|---|---|
| Qwen2.5-7B-Instruct | 28 | 27 | [5] (from repo) |
| Qwen3-0.6B | 28 | 27 | ? |
| Qwen3-1.7B | 28 | 27 | ? |
| Qwen3-4B | 36 | 35 | ? |
| Qwen3-8B | 36 | 35 | ? |
For my Qwen3-8B.yaml file, I set v_loss_layer: 35 (since it has 36 layers) and took a guess with layers: [18] (the 50% midpoint). However, I'm not sure if this is optimal, or if I should follow the Qwen2.5 heuristic (layer 5/28 ≈ 18% depth) and choose a layer like [6] or [7] instead.
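For reference, here is a sketch of my draft Qwen3-8B.yaml, modeled on the repo's Qwen2.5-7B-Instruct ROME config. The `layers` value and the module-template names are my assumptions (I'm guessing Qwen3 keeps the same LLaMA-style module naming as Qwen2.5), so please correct anything that is wrong:

```yaml
# Draft Qwen3-8B.yaml for ROME -- values marked "guess" are my assumptions,
# not confirmed by the EasyEdit team.
alg_name: "ROME"
model_name: "Qwen/Qwen3-8B"
device: 0
layers: [18]            # guess: 50% midpoint of 36 layers; unconfirmed
v_loss_layer: 35        # guess: last layer (36 layers, 0-indexed)
# Module templates, assuming Qwen3 uses the same module names as Qwen2.5:
rewrite_module_tmp: "model.layers.{}.mlp.down_proj"
layer_module_tmp: "model.layers.{}"
mlp_module_tmp: "model.layers.{}.mlp"
attn_module_tmp: "model.layers.{}.self_attn"
ln_f_module: "model.norm"
lm_head_module: "lm_head"
```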
Could you please provide the recommended layers to use for these Qwen3 models for ROME?
Thank you for your help!
Same here. I'm working with the MEMIT and ROME methods on the Qwen3 series and am also hoping to get official hparams settings.
We will work on this. It's currently EMNLP season, so please bear with us.
Thank you for your response! Another issue: it seems the environment cannot recognize Qwen3 models (the pinned transformers version is too old). I hope that can be fixed as well.
Sorry for the late reply. Here are some solutions for the time being:
- Use `nnsight` or `transformer_lens` to conduct causal tracing to decide the layer for updating. These are empirical results, so feel free to test.
- You can update the transformers version on your own. It may conflict with the `qformer` section, but you can comment out that code if you do not need the multimodal part.
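If you are unsure whether your local transformers is new enough, a quick stdlib-only version check like the one below can help. The 4.51.0 cutoff is my understanding of when the Qwen3 architecture was added; please verify it against the transformers release notes.

```python
# Check whether the installed transformers version meets the (assumed)
# minimum for Qwen3 support. 4.51.0 is an assumption, not a confirmed pin;
# verify against the transformers changelog.
MIN_QWEN3_VERSION = (4, 51, 0)

def supports_qwen3(version_string: str) -> bool:
    """Compare a dotted version string against the assumed Qwen3 minimum."""
    parts = tuple(int(p) for p in version_string.split(".")[:3])
    return parts >= MIN_QWEN3_VERSION

print(supports_qwen3("4.46.0"))  # False: too old, Qwen3 configs won't load
print(supports_qwen3("4.51.3"))  # True
```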
Quick update:
- I updated the requirements, and the current code now supports the Qwen3 series.
- For Qwen3's suggested layer, I wrote a causal tracing script here: https://colab.research.google.com/drive/1ZFAtbDzSW3eK4tMhBwUyXtCPJG5K2FuH?usp=sharing
You can test this on your own data and average the per-layer differences across runs to determine the target layer.
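As a minimal sketch of that averaging step (with illustrative placeholder numbers, not real measurements): average the per-layer indirect-effect scores across prompts and take the layer with the largest mean restoration effect.

```python
# Sketch: given per-layer effect scores from causal tracing runs on several
# prompts, average across prompts and pick the layer with the largest mean
# effect. The numbers below are illustrative, not real measurements.
def best_layer(per_prompt_effects: list[list[float]]) -> int:
    """Average effects across prompts; return the argmax layer index."""
    n_layers = len(per_prompt_effects[0])
    means = [
        sum(run[layer] for run in per_prompt_effects) / len(per_prompt_effects)
        for layer in range(n_layers)
    ]
    return max(range(n_layers), key=means.__getitem__)

# Three hypothetical tracing runs over a 6-layer toy model:
effects = [
    [0.1, 0.3, 0.7, 0.4, 0.2, 0.1],
    [0.2, 0.4, 0.6, 0.5, 0.1, 0.0],
    [0.1, 0.2, 0.8, 0.3, 0.2, 0.1],
]
print(best_layer(effects))  # layer 2 has the highest mean effect (0.7)
```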