[Proposal] BERT: Future work

Open rusheb opened this issue 1 year ago • 2 comments

Proposal

This ticket tracks enhancements to BERT support in TransformerLens that were out of scope for the MVP (PR #276). It follows on from issue #258.

These items can be prioritised and spun into separate tickets as necessary.

  • [ ] Expand the demo notebook (demos/BERT.ipynb). The notebook should include:
    • a run-through of the BERT architecture and what makes it different from GPT
      • how the components differ, e.g. TokenType embedding, Post-LayerNorm
      • the masked language modelling (MLM) task and how it differs from causal language modelling
      • notes about loss, and why it doesn't make sense for HookedEncoder.forward to return the loss
    • examples of using HookedEncoder to do the same types of things people would do with HookedTransformer, highlighting similarities and differences
  • [ ] User-test BERT support by using it to do proper interpretability research
    • [ ] Add examples of research using BERT to the demo notebooks
  • [ ] Add support for different tasks, e.g. Next Sentence Prediction, Causal Language Modelling
  • [ ] Add more models: bert-base-uncased, bert-large-cased, bert-large-uncased
  • [ ] Add preprocessing of weights including LayerNorm folding
  • [ ] Accept strings as input and add tokenization helpers from HookedTransformer
  • [ ] Add support for training/finetuning (most notably, dropouts)
  • [ ] Add tests for HookedEncoder convenience properties (e.g. W_U, b_U, W_E)
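
For the notebook item on MLM versus causal language modelling, a minimal sketch of how the training targets differ might help. It is plain Python and does not use TransformerLens. The names `causal_lm_example` and `mlm_example`, the toy `MASK_ID` and `VOCAB_SIZE`, and the 15% / 80-10-10 masking recipe (the scheme described in the original BERT paper) are assumptions made for illustration, not what HookedEncoder does. One possible reading of the loss bullet is that MLM labels depend on a masking step that happens outside the model, so the forward pass alone cannot know them.

```python
import random

# Toy values for illustration only; they are not taken from any real tokenizer.
MASK_ID = 0
VOCAB_SIZE = 100
IGNORE_INDEX = -100  # label value for positions that should not contribute to the loss


def causal_lm_example(tokens):
    """Causal LM: each position predicts the next token.

    The labels are just the inputs shifted by one, so they can be derived
    from the input sequence alone.
    """
    return tokens[:-1], tokens[1:]


def mlm_example(tokens, mask_prob=0.15, seed=0):
    """MLM: corrupt a random subset of positions and predict the originals.

    The labels depend on which positions were selected, which is decided
    here, outside of any model.
    """
    rng = random.Random(seed)
    inputs = list(tokens)
    labels = [IGNORE_INDEX] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected: input unchanged, label ignored
        labels[i] = tok  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_ID  # most selected positions become the mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(1, VOCAB_SIZE)  # some become a random token
        # otherwise the token is left as-is but still has to be predicted
    return inputs, labels
```

Calling `causal_lm_example([5, 6, 7, 8])` pairs each token with its successor, while `mlm_example` returns a corrupted copy of the sequence plus labels that are set only at the selected positions.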

Checklist

  • [x] I have checked that there is no similar issue in the repo (required)

rusheb May 19 '23 09:05

@rusheb what's the status on this?

I'd love to contribute to this.

timsankara Sep 13 '23 08:09

Hi Tim, I think the work is up for grabs if you are keen. It's probably worth checking with @jbloomAus what his priorities are and spinning out a ticket for anything you decide to work on.

I'm happy to discuss specifics if that would be helpful.

rusheb Sep 13 '23 14:09