pythia
Hello team, when I do:
```
from transformers import AutoTokenizer

pretrained_model = "EleutherAI/pythia-160m"
tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model,
    padding_side="left",
    cache_dir=pretrained_model + '_tokenizer',
)
print(tokenizer.pad_token)
```
it seems like the `pad_token` is empty (`None`...
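A common workaround (a sketch, not necessarily this thread's resolution) is to reuse the EOS token as the pad token, since the GPT-NeoX tokenizer ships without one:
```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m", padding_side="left")
# Reuse EOS as the pad token so batched generation can left-pad
# without adding a new embedding row.
tokenizer.pad_token = tokenizer.eos_token
print(tokenizer.pad_token)  # '<|endoftext|>'
```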
Do we have instruct-tuned versions of the Pythia models, so that we can do conversational inference?
Hi, I found that the parameter initialization of the pythia-6.9b model is inconsistent with the standard deviation of the [step0 checkpoint](https://huggingface.co/EleutherAI/pythia-6.9b/tree/step0). Table 6 in the [paper](https://arxiv.org/abs/2304.01373) shows that the init-method...
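For reference, the empirical std can be checked directly against the step0 revision on the Hub; a minimal sketch (the layer picked here is just an example, and the `step0` revision naming follows the Pythia model cards):
```
import torch
from transformers import GPTNeoXForCausalLM

# Load the untrained step0 revision and measure the empirical init std.
model = GPTNeoXForCausalLM.from_pretrained(
    "EleutherAI/pythia-6.9b", revision="step0", torch_dtype=torch.float16
)
for name, param in model.named_parameters():
    if name.endswith("dense_4h_to_h.weight"):
        print(name, param.float().std().item())
        break
```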
Hi, I was wondering whether you could provide the index_mapping files generated by the [GPT2Dataset](https://github.com/EleutherAI/gpt-neox/blob/03186decef022dc35e6adee1a66619968812e0a9/megatron/data/gpt2_dataset.py#L29)? From the construction of the GPT2Dataset [here](https://github.com/EleutherAI/gpt-neox/blob/03186decef022dc35e6adee1a66619968812e0a9/megatron/data/gpt2_dataset.py#L158), I can see there are three `npy`...
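For reference, those three mapping arrays can be inspected with memory-mapped numpy loads; a sketch, with a hypothetical cache prefix (the real filenames also encode sample count, sequence length, and seed):
```
import numpy as np

# Hypothetical cache prefix; real names also carry ns/sl/seed suffixes.
prefix = "pile_0.87_deduped_text_document_train_indexmap"
doc_idx = np.load(prefix + "_doc_idx.npy", mmap_mode="r")
sample_idx = np.load(prefix + "_sample_idx.npy", mmap_mode="r")
shuffle_idx = np.load(prefix + "_shuffle_idx.npy", mmap_mode="r")
print(doc_idx.shape, sample_idx.shape, shuffle_idx.shape)
```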
Hi folks -- thanks for the great work on this. I've been doing some fine-tuning experiments off the Hugging Face checkpoints and was wondering whether anyone has converted the neox optimizer...
Hello everyone! I found a weird inconsistency in the tokenizer vocabulary and wanted to ask why it might be happening. I have loaded a tokenizer from HF: ``` tokenizer =...
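A minimal sketch of the kind of check that surfaces a mismatch (model and tokenizer names are just examples; the GPT-NeoX tokenizer's vocabulary is smaller than the model's padded embedding matrix):
```
from transformers import AutoTokenizer, GPTNeoXForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-160m")

# The tokenizer vocab and the embedding matrix need not agree:
# the embedding is padded up to a hardware-friendly size.
print(len(tokenizer))                                # tokenizer vocab entries
print(model.get_input_embeddings().weight.shape[0])  # padded embedding rows
```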
Task description: "Collect all loss values into CSV files from WandB and -- if needed -- from log files." The most important file is `pythia_runs.tsv`, in which I manually collect the...
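A sketch of the export step, using the public wandb API with a hypothetical run path and metric keys:
```
import csv
import wandb

api = wandb.Api()
run = api.run("eleutherai/pythia/abc123")  # hypothetical entity/project/run_id

with open("loss.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["step", "loss"])
    # scan_history streams full (unsampled) history rows for the given keys;
    # "train/lm_loss" is an assumed metric name.
    for row in run.scan_history(keys=["_step", "train/lm_loss"]):
        writer.writerow([row["_step"], row["train/lm_loss"]])
```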
Per [this](https://github.com/EleutherAI/gpt-neox/pull/1144), my understanding is that the `gas` config in neox doesn't do anything and should be removed; we should be using `gradient_accumulation_steps` instead. It [appears](https://github.com/search?q=repo%3AEleutherAI%2Fpythia+gas&type=code&p=1)...
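For illustration, the replacement in a neox-style config fragment would look like this (values made up):
```
# illustrative neox-style config fragment; values are made up
"train_micro_batch_size_per_gpu": 32,
"gradient_accumulation_steps": 4,
```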
I followed the readme:
```
git lfs clone https://huggingface.co/datasets/EleutherAI/pythia_deduped_pile_idxmaps
python utils/unshard_memmap.py --input_file ./pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00000-of-00082.bin --num_shards 83 --output_dir ./pythia_pile_idxmaps/
```
I got a 600+ GB file, and then I used gpt-neox's dataloader to read...
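For a quick sanity check of the unsharded file, something like the following should work (a sketch; class location per the gpt-neox repo, output path assumed from the command above):
```
# Sanity-check the merged memmap with gpt-neox's indexed dataset reader.
from megatron.data.indexed_dataset import MMapIndexedDataset

# The path is the common prefix, without the .bin/.idx extension.
ds = MMapIndexedDataset("./pythia_pile_idxmaps/pile_0.87_deduped_text_document")
print(len(ds))     # number of documents
print(ds[0][:10])  # first ten token ids of the first document
```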
Fixed invalid filename.