Shamane Siri
Hi, I am using BERT-based UDA. When I calculate the KL loss between the augmented text and the unaugmented text, it is similar to what I get when I send the same text...
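For reference, a minimal sketch of the kind of KL consistency term described here, in plain Python. The logits are made-up numbers and the helper names (`softmax`, `kl_divergence`) are hypothetical, not taken from any UDA codebase. It only illustrates that identical predictions give a KL of zero, while different predictions give a positive value.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

# Hypothetical classifier logits for one example (illustrative values only).
orig_logits = [2.0, 0.5, -1.0]   # unaugmented text
aug_logits = [1.0, 1.2, -0.5]    # augmented text

p = softmax(orig_logits)
q = softmax(aug_logits)

same = kl_divergence(p, p)   # same text sent twice, no dropout noise
diff = kl_divergence(p, q)   # unaugmented vs. augmented prediction
print(same, diff)
```

If the two values come out nearly equal in practice, one thing worth checking is whether dropout is active during both forward passes, since that alone makes two passes over the same text differ.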
Did you use this loss?
The forward function of the TransformerEncoderLayer can have **src_key_padding_mask**. Maybe we can update it too.
What kind of search method are you using? Do you use the faiss library?
Can we increase the sequence length?
Here, after a given number of episodes (the batch size), we train the A3C agent by calculating the return. So we need to feed the states, returns, and advantages as a batch to...
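A minimal sketch of how the returns and advantages for such a batch could be computed, assuming discounted returns and a simple `return - value` advantage. The function names, the `gamma` value, and the sample rewards and value estimates are all assumptions made for illustration, not details from the original post.

```python
def discounted_returns(rewards, gamma=0.99, bootstrap_value=0.0):
    """Compute the discounted return G_t for every step of a rollout.

    bootstrap_value is the critic's estimate for the state after the
    last step (0.0 if the episode terminated).
    """
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

def advantages(returns, values):
    """Advantage estimate A_t = G_t - V(s_t)."""
    return [g - v for g, v in zip(returns, values)]

# Illustrative rollout: three steps with made-up rewards and critic values.
rewards = [1.0, 0.0, 2.0]
values = [0.5, 0.5, 0.5]

returns = discounted_returns(rewards, gamma=0.5)
advs = advantages(returns, values)
print(returns)  # [1.5, 1.0, 2.0]
print(advs)     # [1.0, 0.5, 1.5]
```

The resulting lists line up index by index with the collected states, so the three can be stacked into one batch for the update.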
# 🌟 New model addition

Basically a retrieval augmented model like RAG, but without expensive retriever end2end training

## Open source status

* [ ] the model implementation is available:...