mengzi-retrieval-lm
An experimental implementation of the retrieval-enhanced language model
When I run: ``` python main.py \ --model retrieval \ --model_args pretrained=Langboat/ReGPT-125M-200G \ --device 0 \ --tasks wikitext \ --batch_size 1 ``` I get the following: ``` "config": { "model":...
config.json has the wrong IP address for the indexer. I am running the index server on the same machine, so it needs to be "http://127.0.0.1:8000" instead of just "127.0.0.1" which...
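A minimal sketch of the fix described above, assuming the indexer address is a plain host or host:port string that only needs a scheme and port added. The helper name `normalize_indexer_url` and the default port 8000 are illustrative (the port is taken from the address quoted in the report); this is not code from the repository.

```python
from urllib.parse import urlsplit


def normalize_indexer_url(address, default_scheme="http", default_port=8000):
    """Return the address with an explicit scheme and port.

    Hypothetical helper: assumes an IPv4 address or hostname and drops
    any path component.
    """
    # A bare "127.0.0.1" has no scheme, so add one before parsing.
    if "://" not in address:
        address = f"{default_scheme}://{address}"
    parts = urlsplit(address)
    # Fall back to the assumed default port when none is given.
    port = parts.port or default_port
    return f"{parts.scheme}://{parts.hostname}:{port}"


print(normalize_indexer_url("127.0.0.1"))  # http://127.0.0.1:8000
```

An address that already carries a scheme and port passes through unchanged, so the helper can be applied to the config value unconditionally.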
The installation instructions include: ``` conda create -n mengzi-retrieval-fit python=3.7 ``` I found that this created loads of errors relating to importlib.metadata and importlib_metadata (not for the index but for...
Hi, can you explain or give an example of what prompt we should give for Q&A? The code mentions finding the loss as a whole for a file, but...
I forced retrieval to None, but training with the trainer on 8 V100s is still very slow, about 30,000 samples per hour. Is something wrong?
Thanks for making your work public! I would like to know what computing resources were used for training and retrieval when you trained the GPT-125M model.
After reading some issues, I realized that it would take a lot of time and heavy resources to train a model in my own environment. So...
I found that the logic in prepare_load.py is different from dataset.py. prepare_load.py does not filter out data where len(input_ids) < chunk_size, as dataset.py does at https://github.com/Langboat/mengzi-retrieval-lm/blob/9e370ee0fdd2236ebd5518c2cbc410e8d9894c23/train/dataset.py#L43. Which one should I follow?
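For reference, the filter in question can be sketched as follows. This is only an illustration of dropping examples with len(input_ids) < chunk_size; the function name `filter_short_examples` and the dict-based example layout are assumptions, not the repository's actual code.

```python
def filter_short_examples(examples, chunk_size):
    """Keep only examples with at least chunk_size tokens.

    Hypothetical helper: each example is assumed to be a dict holding
    a list of token ids under the "input_ids" key.
    """
    return [ex for ex in examples if len(ex["input_ids"]) >= chunk_size]


examples = [
    {"input_ids": [1, 2, 3]},        # shorter than chunk_size, dropped
    {"input_ids": [1, 2, 3, 4, 5]},  # long enough, kept
]
print(len(filter_short_examples(examples, chunk_size=4)))  # 1
```

If one script applies this filter and the other does not, the two will see different numbers of examples, which is the discrepancy the question is about.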
I followed the training data setting discussed in #9 and trained the model with the 200 retrieval index and https://github.com/Langboat/mengzi-retrieval-lm/blob/main/train/config.json. But I can't reproduce the ppl as...
The process stops while running the evaluation step for the model.