xwuShirley

Results 5 issues of xwuShirley

Dear authors of TinyBERT, Thanks for the great work; TinyBERT really clarifies how to do distillation. We would like to apply your final models...

I am a PyTorch beginner. Don't we need to write something like a backward function, as in the last section of your paper https://arxiv.org/abs/1806.10779 (the appendix, equations (6)-(11))? I thought...

Dear Author @SunbowLiu, I saw your issue regarding multi-GPU generation (https://github.com/pytorch/fairseq/issues/1937). I wonder whether your released code here actually does multi-GPU generation. Or you only put...

Hi there @madlag, Thanks for your great work! There seems to be a problem with MNLI if we update text_classification/parameters.json with **do_train: 0** and run the following, ``` mkdir result export...

Hi @kaiyux, We're curious about the details in this blog post: https://developer.nvidia.com/blog/nvidia-blackwell-delivers-world-record-deepseek-r1-inference-performance/ Specifically, could you share the configuration used to reproduce the results shown in the image below for H200...

triaged