Shitao Xiao


Hi, thanks for your interest in our work! We use the entire corpus for both pre-training and fine-tuning.

> Based on the number of samples I think CMedQAv2 is the same data as CmedqaRetrieval. Maybe @staoxiao can confirm

Yes, CmedqaRetrieval is v2.

Thanks for your interest in our work! The code for RetroMAE-2 has been released; you can find it at https://github.com/staoxiao/RetroMAE/tree/master/examples/pretrain#pre-train (`dupmae`).

Hi, thanks for your interest in our work! We use the `_whole_word_mask` function to mask tokens, which will not mask the [CLS] token. You can refer to https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/data/data_collator.py#L845.
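
A quick way to see this behaviour, as a minimal sketch (assuming the Hugging Face `DataCollatorForWholeWordMask` and a BERT tokenizer; the sentence and masking probability here are arbitrary): `_whole_word_mask` skips special tokens when collecting mask candidates, so position 0 ([CLS]) is never selected.

```python
from transformers import BertTokenizerFast, DataCollatorForWholeWordMask

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.3)

# Token strings including the special [CLS]/[SEP] markers.
tokens = tokenizer.convert_ids_to_tokens(
    tokenizer("retromae masks whole words during pre-training")["input_ids"]
)

# _whole_word_mask returns a 0/1 list marking which positions get masked;
# special tokens such as [CLS] and [SEP] are never chosen as candidates.
mask_labels = collator._whole_word_mask(tokens)
assert mask_labels[0] == 0  # [CLS] is never masked
print(list(zip(tokens, mask_labels)))
```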

> > Hi, thanks for your interest in our work! We use the _whole_word_mask function to mask tokens, which will not mask the CLS token. You can refer to https://github.com/huggingface/transformers/blob/v4.34.1/src/transformers/data/data_collator.py#L845....

> Actually, while you're here, a question about DupMAE. It looks like the modeling_duplex file you included doesn't have the actual pooling operation you describe in the paper - how...

> Thank you - the rest I can get from the paper. Is there a reason you didn’t train BGE with DupMAE instead of RetroMAE? > > > > Actually,...

Hi, thanks for your interest in our work! Actually, we didn't test the MLM accuracy of RetroMAE on any data. We view the retrieval performance after fine-tuning as the quality...

The CUDA version does not affect training; the problem is more likely caused by the transformers version. You can try downgrading to 4.18, or use...
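
For anyone who wants to check their environment first, a tiny sketch; the 4.18 pin just mirrors the suggestion above, and the exact working range may differ.

```python
# Sanity check: warn if the installed transformers release is newer than the
# version suggested above. `packaging` ships as a transformers dependency.
from importlib.metadata import version
from packaging.version import Version

installed = Version(version("transformers"))
if installed > Version("4.18.0"):
    print(f"transformers {installed} detected; if training fails, "
          "try `pip install transformers==4.18.0`.")
```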

Thanks for your interest in our work! We use 8 GPUs to fine-tune the model, which increases the number of in-batch negatives because we share the negatives across GPUs. Using...
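
To illustrate the idea, here is a minimal PyTorch sketch of sharing in-batch negatives across GPUs with `all_gather`. It is not the exact FlagEmbedding implementation; the helper names (`gather_across_gpus`, `contrastive_loss`) and the temperature value are made up for the example, and it assumes `torch.distributed` has already been initialized by the launcher.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F


def gather_across_gpus(t: torch.Tensor) -> torch.Tensor:
    """Gather a tensor from every rank, keeping gradients for the local shard."""
    gathered = [torch.empty_like(t) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, t.contiguous())
    # all_gather returns tensors without gradient history, so put the local
    # tensor back in place to keep its autograd path intact.
    gathered[dist.get_rank()] = t
    return torch.cat(gathered, dim=0)


def contrastive_loss(q: torch.Tensor, p: torch.Tensor, temperature: float = 0.05):
    # q: (local_batch, dim) query embeddings, p: (local_batch, dim) positives.
    q_all = gather_across_gpus(q)            # (global_batch, dim)
    p_all = gather_across_gpus(p)            # (global_batch, dim)
    scores = q_all @ p_all.T / temperature   # every other passage is a negative
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)
```

With 8 GPUs and a per-device batch of B, each query is contrasted against 8*B passages instead of B, which is the effect described above.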