uaggan
Using mask for D
In the paper, the authors mention that they do not apply the mask to the input of the discriminator (D) for the first 30 epochs. They also stop training the attention part of the model after 30 epochs. Have you implemented these two conditions in the code?
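To make the question concrete, here is a minimal sketch of the schedule as I understand it. It is not taken from this repository: `PhaseFlags`, `phase_flags`, and `SWITCH_EPOCH` are hypothetical names, and counting epochs from 0 (so the switch happens at index 30) is my assumption.

```python
from dataclasses import dataclass

# Threshold as I read it from the paper; 0-based epoch counting is an assumption.
SWITCH_EPOCH = 30


@dataclass(frozen=True)
class PhaseFlags:
    mask_d_input: bool     # feed D the attention-masked image instead of the full image
    train_attention: bool  # keep updating the attention part of the model


def phase_flags(epoch: int, switch_epoch: int = SWITCH_EPOCH) -> PhaseFlags:
    """Hypothetical helper: return the training flags for a 0-based epoch index."""
    late = epoch >= switch_epoch
    # Before the switch: D sees full images and attention is trained.
    # From the switch on: D sees masked images and attention is frozen.
    return PhaseFlags(mask_d_input=late, train_attention=not late)


if __name__ == "__main__":
    for epoch in (0, 29, 30, 31):
        print(epoch, phase_flags(epoch))
```

In a training loop, I would expect these flags to be checked once per epoch to decide what D receives and whether the attention parameters get optimizer updates.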