Kexin Wang comments

Results: 32 comments of Kexin Wang

issue while running the training script

The `mnrl_**` arguments are now set to `None` by default, so you will no longer run into MNRL (i.e. the baseline QGen) issues: https://github.com/UKPLab/gpl/pull/12

issue while running the training script

Hi @christopherfeld, I have created a Google Colab notebook showing how to run this toy example. Please have a look here: https://colab.research.google.com/drive/1Wis4WugIvpnSAc7F7HGBkB38lGvNHTtX?usp=sharing. I hope this helps :)

TSDAE + GPL and TAS-B + GPL

Hi @Yuan0320, thanks for your interest! Sorry that the description of this in the paper is somewhat misleading. It is composed of three training stages: `(1) TSDAE on ${dataset} ->...

TSDAE + GPL and TAS-B + GPL

Sorry, there is currently no one-step solution for this. To reproduce it, please run these steps one by one: (1) Train a TSDAE model on the **target** corpus: https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/TSDAE. Note...

TSDAE + GPL and TAS-B + GPL

You can use this file instead: https://sbert.net/datasets/msmarco-hard-negatives.jsonl.gz. The format is different from our `gpl-training-data.tsv`, but thankfully @nreimers has already wrapped the code for training this zero-shot baseline in a single...

TSDAE + GPL and TAS-B + GPL

I think you just need to change `--retriever_score_functions "cos_sim" "cos_sim"` to `--retriever_score_functions "dot"`, since you only have one negative miner and the miner is the TAS-B model, which was trained...
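To illustrate why the choice between `cos_sim` and `dot` matters for a negative miner, here is a minimal sketch in plain Python. It does not use the GPL code base: the vectors, document names, and helper functions are all hypothetical, and it only shows that the two scoring functions can rank the same embeddings differently.

```python
import math

def dot(a, b):
    """Dot-product score: sensitive to both direction and vector length."""
    return sum(x * y for x, y in zip(a, b))

def cos_sim(a, b):
    """Cosine similarity: the dot product of the length-normalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical toy embeddings (not produced by any real model).
query = [1.0, 0.0]
docs = {
    "short_aligned": [1.0, 0.0],   # same direction as the query, small norm
    "long_off_axis": [3.0, 3.0],   # 45 degrees off the query, large norm
}

# Rank the documents under each scoring function, best first.
rank_by_dot = sorted(docs, key=lambda d: dot(query, docs[d]), reverse=True)
rank_by_cos = sorted(docs, key=lambda d: cos_sim(query, docs[d]), reverse=True)

print("dot    :", rank_by_dot)
print("cos_sim:", rank_by_cos)
```

Because the rankings can disagree, it is generally safest to score a retriever with the same function it was trained with.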

TSDAE + GPL and TAS-B + GPL

Hi @ArtemisDicoTiar, your understanding is correct. The "TSDAE" entries in Tables 1 and 9 refer to the same method, i.e. TSDAE (target → MS-MARCO). "Target" here means a certain dataset from the target...

KeyError during pseudo labeling

Thanks to both of you for your attention, @ahadda5 @sudhanshu-shukla-git! I will add a type assertion, `assert type(did) == str`, here. This setting follows the one in the [BeIR](https://github.com/beir-cellar/beir) repo. I think...
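As a rough illustration of what such an assertion guards against, the sketch below assumes a corpus stored as a dict keyed by string document IDs. The `corpus` contents and the `lookup` helper are hypothetical and not taken from the GPL code; only the `assert type(did) == str` check comes from the comment above.

```python
# Hypothetical corpus keyed by string document IDs.
corpus = {"1": "first document", "2": "second document"}

def lookup(corpus, did):
    """Return the document text for `did`, failing early on non-string IDs."""
    assert type(did) == str, f"document id must be str, got {type(did).__name__}"
    return corpus[did]

# A string ID resolves as expected.
text = lookup(corpus, "1")

# Without the check, an integer ID misses the string key and raises KeyError.
try:
    corpus[1]
    int_key_found = True
except KeyError:
    int_key_found = False

# With the check, the same mistake surfaces as a clearer AssertionError.
try:
    lookup(corpus, 1)
    assertion_raised = False
except AssertionError:
    assertion_raised = True
```

Note that Python skips `assert` statements when run with `-O`, so an explicit `raise TypeError(...)` would be the sturdier choice if the check must always run.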

KeyError during pseudo labeling

I have added the type hints and the assertion: https://github.com/UKPLab/gpl/pull/12

Multi-lingual GPL

@Matthieu-Tinycoaching Sorry, I have not studied multilingual scenarios myself, so this is beyond the scope of my knowledge. As @nreimers said, maybe you can test it and compare different...