
reproducing your results on MS MARCO

Narabzad opened this issue 3 years ago · 8 comments

Hi,

Thank you for your great work! I would like to replicate your results on the MS MARCO passage collection, and I have a question regarding the Luyu/co-condenser-marco model. Is this the final model that you used to retrieve documents, or do I need to fine-tune it on MS MARCO relevant query/passage pairs? Could you provide a bit more detail on how I should use your dense retrieval toolkit with this model?

Thank you in advance!

Narabzad · Sep 23 '21 16:09

Hello,

Please take a look at the coCondenser fine-tuning tutorial. It should answer most of your questions.

We can leave this issue open for now in case you run into other problems.

luyug · Sep 24 '21 13:09

Thank you for the great tutorial! One small issue I found: --passage_reps corpus/corpus/'*.pt' should be --passage_reps encoding/corpus/'*.pt' in the index search step at https://github.com/texttron/tevatron/tree/main/examples/coCondenser-marco#index-search

Narabzad · Sep 30 '21 14:09

Thanks for catching that!

luyug · Oct 01 '21 18:10

Hi,

I was able to replicate the MRR@10 you reported in the paper (0.38), but I was wondering what explains the difference between that and the number reported on the leaderboard (0.44). How do I replicate the 0.44? Is it measured on a different set?

Narabzad · Oct 13 '21 19:10

Hi, @luyug

Thanks for your awesome work. I have a similar question about NQ. Could you give more details on how to reproduce the NQ results in the paper (84.3 MRR@5), along the lines of the detailed MS MARCO tutorial?

Or, if that will take some time, could you tell me whether your SOTA model on NQ is trained with mined hard negatives only, or with both BM25 hard negatives and mined hard negatives, as in the DPR GitHub repo?

Thanks.

shunyuzh · Oct 18 '21 07:10

Hi @luyug,

Thanks for your great work! I am also confused about the difference between the reported result and the leaderboard number (0.38 vs. 0.44). Is there any update on this?

Yuan0320 · Nov 04 '22 12:11

Also interested. From what I remember, the main difference is that a reranker is applied on top. Would it be possible to get the checkpoint of the reranker?

cadurosar · Nov 04 '22 12:11

Hi, thank you for your great work! I ran into some issues while trying to reproduce the results on MS MARCO passage. I have followed the aforementioned tutorial but still cannot resolve them (the problem seems to be in the hard-negative mining step).

First, I ran Fine-tuning Stage 1 with:

CUDA_VISIBLE_DEVICES=3 python -m tevatron.driver.train \
  --output_dir model_msmarco_s1 \
  --model_name_or_path ../data/co-condenser-marco \
  --save_steps 20000 \
  --train_dir ../data/msmarco-passage/train_dir \
  --data_cache_dir ../data/msmarco-passage-train-cache \
  --fp16 \
  --dataloader_num_workers 2 \
  --per_device_train_batch_size 8 \
  --train_n_passages 8 \
  --learning_rate 5e-6 \
  --q_max_len 16 \
  --p_max_len 128 \
  --num_train_epochs 3 \
  --logging_steps 500

and got MRR@10=0.3596, R@1000=0.9771 (your reported results are MRR@10=0.357, R@1000=0.978).
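In case it helps track down where my numbers diverge, this is roughly how I computed MRR@10 from the ranking file (a quick helper of my own, not a tevatron script; it assumes a run file with qid/pid/rank columns and the standard MS MARCO qrels.dev.tsv):

# mrr10.py -- my own quick check, not part of tevatron.
# Assumes: run file with "qid pid rank" per line, and a qrels file
# with "qid 0 pid rel" per line (e.g. qrels.dev.tsv).
from collections import defaultdict
import sys

def load_qrels(path):
    qrels = defaultdict(set)
    with open(path) as f:
        for line in f:
            qid, _, pid, rel = line.split()
            if int(rel) > 0:
                qrels[qid].add(pid)
    return qrels

def mrr_at_10(run_path, qrels):
    best = {}  # qid -> best (smallest) rank of a relevant passage within the top 10
    with open(run_path) as f:
        for line in f:
            qid, pid, rank = line.split()[:3]
            rank = int(rank)
            if rank <= 10 and pid in qrels.get(qid, set()):
                best[qid] = min(rank, best.get(qid, rank))
    return sum(1.0 / r for r in best.values()) / len(qrels)

if __name__ == "__main__":
    print("MRR@10:", mrr_at_10(sys.argv[2], load_qrels(sys.argv[1])))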

Then I ran the hard-negative mining step, randomly sampling 30 negatives from the top-200 retrieval results of model_msmarco_s1, by modifying scripts/hn_mining.py (following the parameters in build_train_hn.py).
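To be concrete, the sampling rule I implemented amounts to something like this (the function and variable names are mine, not from scripts/hn_mining.py; the depth of 200 and the 30 sampled negatives follow the parameters I took from build_train_hn.py):

# Sketch of my negative sampling: for each query, take its top-200 retrieved
# passages from model_msmarco_s1, drop the known positives, and randomly keep
# 30 of the remainder as hard negatives.
import random

DEPTH = 200        # how deep in the ranking to consider
N_NEGATIVES = 30   # hard negatives kept per query

def sample_hard_negatives(ranked_pids, positive_pids, depth=DEPTH, n=N_NEGATIVES):
    candidates = [pid for pid in ranked_pids[:depth] if pid not in positive_pids]
    return random.sample(candidates, min(n, len(candidates)))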

Next, I ran Fine-tuning Stage 2 with:

CUDA_VISIBLE_DEVICES=3 python -m tevatron.driver.train \
  --output_dir model_msmarco_s2 \
  --model_name_or_path ../data/co-condenser-marco \
  --save_steps 20000 \
  --train_dir ../data/msmarco-passage/tain_dir_hn_dr_cocondenser200 \
  --data_cache_dir ../data/msmarco-passage-tain_hn_dr_cocondenser200-cache \
  --fp16 \
  --dataloader_num_workers 2 \
  --per_device_train_batch_size 8 \
  --train_n_passages 8 \
  --learning_rate 5e-6 \
  --q_max_len 16 \
  --p_max_len 128 \
  --num_train_epochs 2 \
  --logging_steps 500

and got MRR@10=0.3657, R@1000=0.9761 (your reported results are MRR@10=0.382, R@1000=0.984).

There are a few details I would like to confirm:

  1. Is the training data for Fine-tuning Stage 2 only the mined hard negatives, i.e., not concatenated with the BM25 negatives?
  2. Are the initial parameters for Stage 2 loaded from co-condenser-marco, rather than from the model_msmarco_s1 checkpoint?
  3. What are the intended settings of per_device_train_batch_size, train_n_passages, learning_rate, and num_train_epochs for Fine-tuning Stage 2?

Thank you in advance!

caiyinqiong · Apr 07 '23 07:04