roymiles
Hi, thank you for releasing this project! I was wondering if you happen to have the pre-trained weights for the models fine-tuned on the different downstream tasks (QQP, MNLI, etc.)....
Hi, will you provide the training code to replicate the results from the paper and generate the frozen TensorFlow graph model? Thanks
Could you include a comparison with Information Theoretic Representation Distillation (ITRD)? Paper: https://bmvc2022.mpi-inf.mpg.de/0385.pdf Code: https://github.com/roymiles/ITRD
Could you add Information Theoretic Representation Distillation (ITRD)? Paper: https://bmvc2022.mpi-inf.mpg.de/0385.pdf Code: https://github.com/roymiles/ITRD
Hi, I was just wondering, were you able to replicate the results in the original paper using this implementation? Thanks
Hi, thanks for releasing this work! It has all been very interesting to read. However, I do have a few questions regarding your results and methodology. 1. For Table 4....