Lisennlp
Hello, could you share the Self-instruction data?
I followed the README:
```
git lfs clone https://huggingface.co/datasets/EleutherAI/pythia_deduped_pile_idxmaps
python utils/unshard_memmap.py --input_file ./pythia_deduped_pile_idxmaps/pile_0.87_deduped_text_document-00000-of-00082.bin --num_shards 83 --output_dir ./pythia_pile_idxmaps/
```
I got a 600+ GB file, and then I used gpt-neox's dataloader to read...
Hello, I looked at your batch_view.py and found that the data is not shuffled, whereas in the gpt-neox library the data is shuffled. So I want to make sure that...
I used the aqt_einsum function in the code to quantize only the qk score, and then trained the model. However, I found that the loss dropped very slowly after training...
I am very interested in this work, but I found that the code structure is too deeply nested: many configs are buried far down and are all hard-coded, making it difficult...