For example, which dataset is it (MovieLens, EachMovie, or Netflix), and how did you split it into training and test sets in your SIGIR paper?

I am really interested in your work. When do you plan to release the code? Thanks a lot!

Do you have plans to support token_type_ids? This is important for QA and search model inference.
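
For context, here is a minimal sketch of what `token_type_ids` carry for a BERT-style query/passage pair. It is plain Python and not this project's API: the helper name `build_pair_inputs` is hypothetical, the special-token IDs 101 and 102 are assumed to follow the `bert-base-uncased` vocabulary, and the other token IDs are made up for illustration.

```python
from typing import Dict, List


def build_pair_inputs(
    query_ids: List[int],
    passage_ids: List[int],
    cls_id: int = 101,  # assumption: [CLS] id in the bert-base-uncased vocab
    sep_id: int = 102,  # assumption: [SEP] id in the bert-base-uncased vocab
) -> Dict[str, List[int]]:
    """Hypothetical helper: lay out [CLS] query [SEP] passage [SEP]."""
    input_ids = [cls_id] + query_ids + [sep_id] + passage_ids + [sep_id]
    # Segment 0 covers [CLS], the query, and the first [SEP];
    # segment 1 covers the passage and the final [SEP].
    token_type_ids = [0] * (len(query_ids) + 2) + [1] * (len(passage_ids) + 1)
    attention_mask = [1] * len(input_ids)
    return {
        "input_ids": input_ids,
        "token_type_ids": token_type_ids,
        "attention_mask": attention_mask,
    }


# Illustrative token ids only; a real tokenizer would produce these.
encoded = build_pair_inputs([7, 8], [9, 10, 11])
print(encoded["token_type_ids"])
```

Without `token_type_ids`, a cross-encoder that was trained with segment embeddings sees both segments as segment 0, which is why the field matters for QA and reranking inference.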

Great work! I have some questions below:

    class RetroMAECollator(DataCollatorForWholeWordMask):
        max_seq_length: int = 512
        encoder_mlm_probability: float = 0.15
        decoder_mlm_probability: float = 0.15

        def __call__(self, examples):
            input_ids_batch = []
            attention_mask_batch = []...

It seems that there is only loss_ab in the released code, right?