transformers

Add HHEMv2 Model

Open Miaoranmmm opened this issue 1 year ago • 1 comments

This PR adds support for the HHEMv2 model. For information about HHEMv2, please see the model card. cc @forrestbao

Miaoranmmm avatar Aug 22 '24 03:08 Miaoranmmm

Hey! From scanning through the code, I am not sure I understand what the difference from T5 is here. 😓 If you want to add the model, the best approach is to separate the parts copied from T5 (wrap them with `# Copied from`) from the parts specific to this model!

Hi @ArthurZucker, thanks for your feedback. We have removed the duplicated content from modeling_hhemv2.py. Please review the updated version.

Miaoranmmm avatar Aug 26 '24 14:08 Miaoranmmm

Hi @ArthurZucker, there are two differences here:

  1. We hacked T5ForTokenClassification to do sequence classification. We prepend a <pad> token to the sequence in lieu of a [CLS] token and use the hidden state at the <pad> position to predict the final label. This way, we use only the encoder part of T5.
  2. We also added the prompt template and tokenization to the class to save users the hassle. Users can now pass a pair of texts and get the hallucination score, while the prompting and tokenization are done automatically under the hood.
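
To make the two points above concrete, here is a minimal, dependency-free sketch of the data flow. It is not the code from this PR: the prompt template, the `toy_encoder` stand-in, the token ids, and the classifier weights are all hypothetical, and the <pad> id of 0 is an assumption based on the standard T5 vocabulary.

```python
import math

PAD_ID = 0  # assumption: <pad> has id 0, as in the standard T5 vocabulary

# Hypothetical template for illustration only; not the actual HHEMv2 prompt.
PROMPT = "premise: {premise} hypothesis: {hypothesis}"


def build_prompt(premise, hypothesis):
    """Fill the (hypothetical) prompt template with a pair of texts."""
    return PROMPT.format(premise=premise, hypothesis=hypothesis)


def prepend_pad(input_ids, pad_id=PAD_ID):
    """Put a <pad> token at position 0, where a [CLS] token would normally sit."""
    return [pad_id] + list(input_ids)


def toy_encoder(input_ids, hidden_size=4):
    """Stand-in for the T5 encoder (hypothetical).

    Each position gets its own embedding plus the mean embedding of the
    whole sequence, so position 0 depends on every token, loosely
    mimicking what self-attention provides in a real encoder.
    """
    embeds = [[math.sin(tok + d) for d in range(hidden_size)] for tok in input_ids]
    mean = [sum(col) / len(embeds) for col in zip(*embeds)]
    return [[e + m for e, m in zip(row, mean)] for row in embeds]


def classify(hidden_states, weights, bias):
    """Linear head + softmax applied to the hidden state at the <pad> position."""
    first = hidden_states[0]
    logits = [sum(w * h for w, h in zip(row, first)) + b
              for row, b in zip(weights, bias)]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Usage with made-up token ids; a real pipeline would tokenize build_prompt(...).
ids = prepend_pad([5, 6, 7])
probs = classify(toy_encoder(ids), [[0.1] * 4, [-0.1] * 4], [0.0, 0.0])
```

Only the encoder output at position 0 feeds the classifier, which is why the decoder is not needed in this setup.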

forrestbao avatar Aug 28 '24 15:08 forrestbao

Hey!

  1. Changing the token should not require changes in the modeling code. However, if you change the way the forward pass works (not the token or the id being used), then alright!
  2. This is not really something we want to have in the model itself. The chat template should be in the documentation and saved directly in the model that is pushed to the hub! If you wait a bit for #33248 to be merged, it will be a lot better to use this approach!

ArthurZucker avatar Sep 06 '24 11:09 ArthurZucker