Generating weights for efk/mend for new model

Open: salemohamedo opened this issue 3 years ago • 1 comment

Hi, I was wondering if you guys can tell me how I can generate weights for distilgpt2 for the mend/efk baselines, similar to what you have for gpt2-xl here: https://rome.baulab.info/data/weights/. I'm trying to run these baselines but don't have the saved weights. I tried simply loading and saving huggingface's weights for distilgpt2 but it looks like the code is looking for something a bit different. If you guys have a script/suggestions, that would be great.

Thanks!

salemohamedo, Apr 17 '22 17:04

Hi @salemohamedo, you'll want to refer to the MEND repository for more information.

They have instructions for training a MEND baseline for a variety of GPT models; I don't recall whether DistilGPT2 is supported out of the box, but I'm sure Eric can help you with this if not! After training a model, simply place the trained .pt file in baselines/mend/weights. You can consult mend_main.py for details on the naming conventions, so that the code picks up the new model file.
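
For anyone following along, the "place the trained .pt model in baselines/mend/weights" step could be sketched roughly as below. This is only an illustration, not something taken from the repo: the helper name install_mend_weights, the checkpoint path, and the target file name are all hypothetical, and the real file name has to match whatever mend_main.py expects.

```python
import shutil
from pathlib import Path


def install_mend_weights(checkpoint: Path, repo_root: Path, target_name: str) -> Path:
    """Copy a trained checkpoint into <repo_root>/baselines/mend/weights.

    `target_name` is a placeholder: check mend_main.py for the file name
    the code actually looks for, since this helper does not know it.
    """
    if not checkpoint.is_file():
        raise FileNotFoundError(f"checkpoint not found: {checkpoint}")

    # Directory named in the reply above; created if it does not exist yet.
    weights_dir = repo_root / "baselines" / "mend" / "weights"
    weights_dir.mkdir(parents=True, exist_ok=True)

    destination = weights_dir / target_name
    shutil.copy2(checkpoint, destination)  # copies file contents and metadata
    return destination


# Example call (both paths and the file name are assumptions):
# install_mend_weights(Path("outputs/model.pt"), Path("."), "mend-distilgpt2.pt")
```

The helper only copies the file; it does not check that the checkpoint has the structure the MEND baseline expects, so a mismatch would still show up at load time.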

Let me know if you have further questions!

kmeng01, Apr 27 '22 23:04