DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
Hello, I'm following your work and trying to run your code. When I set up the environment in my anaconda env with `pip install -e transformers-4.28.1`, I run into the following problem:...
Hi Team, great work on the project! I noticed something in `dola.py` line 217 that might need attention. The code is: ```python log_probs = diff_logits[range(diff_logits.shape[0]), continue_ids].sum().item() ``` I am wondering...
https://github.com/voidism/DoLa/blob/dc88907406f9744f748f3c779f2353efd5bdc824/transformers-4.28.1/src/transformers/models/llama/modeling_llama.py#L703 I think you should apply the `model.norm` layer to `hidden_states[early_exit_layer]`, because only the last hidden state has `model.norm` applied to it. See https://github.com/voidism/DoLa/blob/dc88907406f9744f748f3c779f2353efd5bdc824/transformers-4.28.1/src/transformers/models/llama/modeling_llama.py#L594
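A minimal sketch of the point raised above: in a LLaMA-style decoder, only the final layer's hidden state passes through `model.norm` (an RMSNorm) before the `lm_head`, so an early-exit hidden state should arguably be normalized the same way before projecting it to logits. All names and shapes here are illustrative, not the repo's actual API.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Simplified LLaMA-style RMSNorm (no bias, variance over the last dim)."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x):
        variance = x.pow(2).mean(-1, keepdim=True)
        return self.weight * x * torch.rsqrt(variance + self.eps)

torch.manual_seed(0)
dim, vocab = 8, 16
norm = RMSNorm(dim)
lm_head = nn.Linear(dim, vocab, bias=False)

# Hypothetical per-layer hidden states, keyed by layer index.
hidden_states = {0: torch.randn(1, dim), 2: torch.randn(1, dim)}
early_exit_layer = 0

# Projecting the raw early-exit hidden state skips the final norm...
raw_logits = lm_head(hidden_states[early_exit_layer])
# ...whereas the suggested fix applies model.norm first, matching what
# the last layer's hidden state receives before the lm_head.
normed_logits = lm_head(norm(hidden_states[early_exit_layer]))
```

The two projections generally disagree, which is why normalizing before the head matters for comparing early-exit logits against final-layer logits.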
automatcially -> automatically
Hi, since the model's output at each token position represents the probabilities of the *next* token, shouldn't the calculation of log_probs be misaligned? I mean `diff_logits[range(diff_logits.shape[0]-1), continue_ids[1:]].sum().item()` instead of...
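The alignment question above can be illustrated with toy tensors: the logits at position i are the model's prediction for token i+1, so scoring a continuation means pairing shifted logits with shifted targets. `continue_ids` here is a stand-in for the answer-token ids, not the repo's actual variable; whether the shift is needed depends on how the inputs were sliced upstream.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
seq_len, vocab = 5, 10
logits = torch.randn(seq_len, vocab)            # one row of logits per position
log_probs_all = F.log_softmax(logits, dim=-1)
continue_ids = torch.tensor([3, 7, 1, 4, 2])    # token ids at positions 0..4

# Unshifted indexing: pairs position i's logits with token i.
unshifted = log_probs_all[range(seq_len), continue_ids].sum().item()

# Shifted indexing, as the question suggests: position i's logits
# score the token that actually follows it, token i+1.
shifted = log_probs_all[range(seq_len - 1), continue_ids[1:]].sum().item()
```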
GPT-3 has been deprecated. What type of model should I use to fine-tune into a GPT-judge? Also, due to the change in the fine-tuning format, what changes should I make...
I think there is a problem with the implementation of the Jensen-Shannon divergence in DoLa and in ReDeep, a new hallucination detection method. I described the problem here: https://github.com/Jeryi-Sun/ReDEeP-ICLR/issues/2 The code...
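For reference, a generic implementation of the Jensen-Shannon divergence between two next-token distributions, as used for premature-layer selection in DoLa-style contrastive decoding: JSD(p, q) = ½·KL(p‖m) + ½·KL(q‖m) with m = (p + q)/2, which is symmetric and bounded by log 2 in nats. This is a sketch of the standard definition, not the repo's code.

```python
import torch
import torch.nn.functional as F

def jsd(p_logits, q_logits):
    """Jensen-Shannon divergence between softmax(p_logits) and softmax(q_logits)."""
    p = F.softmax(p_logits, dim=-1)
    q = F.softmax(q_logits, dim=-1)
    m = 0.5 * (p + q)
    # F.kl_div expects log-probabilities as input and probabilities as target,
    # so kl_div(m.log(), p) computes KL(p || m).
    kl_pm = F.kl_div(m.log(), p, reduction="batchmean")
    kl_qm = F.kl_div(m.log(), q, reduction="batchmean")
    return 0.5 * (kl_pm + kl_qm)

torch.manual_seed(0)
x = torch.randn(2, 32)
assert jsd(x, x).item() < 1e-6           # identical distributions -> 0
assert jsd(x, torch.randn(2, 32)) >= 0   # JSD is always non-negative
```

A common bug in JSD implementations is computing KL(m‖p) instead of KL(p‖m), or skipping the mixture m entirely, so checking the argument order of `F.kl_div` is a good first step when auditing either codebase.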
Thanks for your great work! Can DoLa support more LLM models, such as Llama 3.1, Llama 2, Qwen2, or the Mistral series?
Thank you for your excellent work. 1. I think [**"scores_normalized = scores.log_softmax(dim=-1)"**](https://github.com/voidism/DoLa/blob/11b73b74ec1a72216e3c97c587177d72d8288f8f/dola.py#L113C8-L113C56) is redundant, because `scores` has already been passed through log_softmax (i.e., `final_logits = final_logits.log_softmax(dim=-1)`). 2. When you fix...
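A quick check of the redundancy claim above: `log_softmax` is idempotent, because for already-normalized log-probabilities the logsumexp term it subtracts is log(1) = 0. So the second pass is a no-op numerically (redundant but harmless). Toy tensor only, not the repo's code.

```python
import torch

torch.manual_seed(0)
scores = torch.randn(3, 10).log_softmax(dim=-1)  # already log-probabilities
again = scores.log_softmax(dim=-1)               # redundant second pass

# The log-probs already sum (in probability space) to 1, so the
# subtracted logsumexp is ~0 and nothing changes.
print(torch.allclose(scores, again, atol=1e-6))  # True
```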