
Langboat/ReGPT-125M-200G score isn't reproducible


When I run:

python main.py \
    --model retrieval \
    --model_args pretrained=Langboat/ReGPT-125M-200G \
    --device 0 \
    --tasks wikitext  \
    --batch_size 1

I get the following:

  "config": {
    "model": "retrieval",
    "model_args": "pretrained=Langboat/ReGPT-125M-200G",
    "num_fewshot": 0,
    "batch_size": 1,
    "device": "0",
    "no_cache": false,
    "limit": null,
    "bootstrap_iters": 100000,
    "description_dict": {}
  }
}
retrieval (pretrained=Langboat/ReGPT-125M-200G), limit: None, provide_description: False, num_fewshot: 0, batch_size: 1
|  Task  |Version|    Metric     | Value |   |Stderr|
|--------|------:|---------------|------:|---|------|
|wikitext|      1|word_perplexity|36.1793|   |      |
|        |       |byte_perplexity| 1.9563|   |      |
|        |       |bits_per_byte  | 0.9681|   |      |

whereas I believe it should be closer to 22 word perplexity, according to the README.
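For what it's worth, the three metrics in that table are internally consistent with each other, which suggests the harness is computing them correctly and the gap to the README comes from the model or retrieval setup. A minimal check, using how lm-evaluation-harness derives these metrics from one shared total negative log-likelihood (NLL):

```python
# Consistency check on the reported numbers. In lm-evaluation-harness the
# wikitext metrics are the same total NLL normalized three ways:
#   word_perplexity = exp(NLL / num_words)
#   byte_perplexity = exp(NLL / num_bytes)
#   bits_per_byte   = log2(byte_perplexity)
import math

byte_perplexity = 1.9563
print(math.log2(byte_perplexity))  # ~0.9681, matching the reported bits_per_byte
```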

iamtrask avatar Sep 09 '23 05:09 iamtrask

When I reduce:

python -u download_index_db.py  --num 200

from 200 down to 10, i.e.

python -u download_index_db.py  --num 10

the score is still EXACTLY the same (36.1793). I made sure the cache was cleared, so the evaluation actually re-ran, and I verified that it does in fact talk to the server holding the data.
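One more check that might be worth adding: confirm the two downloads actually leave different data on disk. A minimal sketch; the `db` path below is my guess, not the repo's documented layout, so adjust it to wherever download_index_db.py writes:

```python
# Hypothetical check: fingerprint whatever download_index_db.py wrote, then
# re-run with a different --num and compare. The "db" path is an assumption.
import hashlib
import pathlib

def fingerprint(root: str) -> str:
    h = hashlib.sha256()
    for p in sorted(pathlib.Path(root).rglob("*")):
        if p.is_file():
            h.update(p.name.encode())
            h.update(p.read_bytes()[: 1 << 20])  # first 1 MiB per file is enough
    return h.hexdigest()

print(fingerprint("db"))  # run once after --num 200 and once after --num 10
```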

This makes me think the model isn't actually using what it gets back from its retrieval queries, since changing the database doesn't change the score.
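One way to test that hypothesis: score a retrieval-free baseline on the same data and see whether it lands in the same place. A minimal sketch using Hugging Face transformers/datasets; EleutherAI/gpt-neo-125m is my assumption for the backbone, and this computes token-level (not word-level) perplexity, so it's only a rough comparison:

```python
# Retrieval-free baseline: if the retrieval model's score matches a model
# that provably uses no index, that supports the "retrieval is ignored"
# hypothesis. The backbone name is an assumption; swap in the real one.
import math
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-125m"  # assumed base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
enc = tokenizer(text, return_tensors="pt")

max_len = 1024  # well under GPT-Neo's 2048-token context
nlls, n_tokens = [], 0
for begin in range(0, enc.input_ids.size(1), max_len):
    ids = enc.input_ids[:, begin : begin + max_len]
    if ids.size(1) < 2:
        continue  # nothing to predict in a 1-token chunk
    with torch.no_grad():
        out = model(ids, labels=ids)  # out.loss = mean NLL over ids.size(1)-1 targets
    nlls.append(out.loss * (ids.size(1) - 1))
    n_tokens += ids.size(1) - 1

token_ppl = math.exp(torch.stack(nlls).sum().item() / n_tokens)
print(f"token-level perplexity: {token_ppl:.2f}")
```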

iamtrask avatar Sep 09 '23 05:09 iamtrask

I'm sorry for the late reply. This project is no longer maintained; the other contributor and I have left the company. I recommend the RETRO implementation in Megatron-LM, as that team appears to be continuing research in this direction (RETRO++). If you're interested in exchanging research ideas in this area, you can DM me on Twitter; I'm still following progress in this field.

Ag2S1 avatar Nov 20 '23 17:11 Ag2S1