Iz Beltagy
Can you try this for regular BERT and see if you get the same pattern?
As I said, I don't think this is a bug; it is just how the model decided to represent your tokens. As for the similarity measures, maybe normalizing the vector...
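The normalization idea above can be sketched as follows (a minimal, model-agnostic illustration with made-up toy vectors, not BERT-specific):

```python
import numpy as np

def cosine_similarity(u, v):
    # Normalize both vectors so the dot product becomes cosine similarity,
    # removing the effect of embedding magnitude on the comparison.
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return float(np.dot(u, v))

# Toy vectors standing in for token embeddings (hypothetical values).
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction, twice the magnitude

# After normalization the two are maximally similar (close to 1.0),
# even though a raw dot product would differ by the scale factor.
print(cosine_similarity(a, b))
```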
1. The official package is better if it has gradient accumulation (they have an open PR for it: https://github.com/allenai/allennlp/pull/3051).
2. What do you mean by a regular dependency?
[Here](https://huggingface.co/bigscience/gpt2-350m-en/tree/megatron-deepspeed)'s a megatron-deepspeed checkpoint and [here](https://huggingface.co/bigscience/gpt2-350m-en/tree/main)'s the corresponding HF-transformer checkpoint. We just need to verify that these two are the same.
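One way to sketch that verification (a minimal illustration with dummy state dicts; in practice you would `torch.load()` the converted Meg-DS weights and the HF checkpoint and compare those, and key names may need mapping between the two layouts):

```python
import torch

def state_dicts_match(sd_a, sd_b, atol=1e-6):
    # Two checkpoints "are the same" if they expose identical parameter
    # names and numerically equal tensors (up to a small tolerance).
    if sd_a.keys() != sd_b.keys():
        return False
    return all(torch.allclose(sd_a[k], sd_b[k], atol=atol) for k in sd_a)

# Toy state dicts standing in for the two checkpoints (hypothetical values).
sd1 = {"w": torch.tensor([1.0, 2.0])}
sd2 = {"w": torch.tensor([1.0, 2.0])}
print(state_dicts_match(sd1, sd2))
```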
Yes, to run Meg-DS training. Basically, doing the steps listed in the README here (https://github.com/bigscience-workshop/Megatron-DeepSpeed) for them, so that they only need to run the `pretrain_*` script.
@jaketae can be the first user of the AMI
Dirk's config is this branch https://github.com/allenai/LLM/tree/DirksRun2