Varun Gumma

8 comments by Varun Gumma

@AIikai @robotsp I am also looking to distill and prune a few LLMs. Any leads?

@robotsp @AIikai @HeegonJin please head over [here](https://github.com/facebookresearch/fairseq/issues/4738) for KD in fairseq.

@HeegonJin I have a basic implementation of KD in my repo [here](https://github.com/VarunGumma/fairseq). It is based on the implementation at https://github.com/LeslieOverfitting/selective_distillation. They use a much older version of `fairseq` and I have...

Please use the latest version of my code; you can find an example of `knowledge_distillation_translation` in the examples folder. As this work is in progress, I make multiple bug...

@HeegonJin I use a custom model architecture, which I defined in a file in the directory `$custom_model_dir`. If you are using models (parent and student) which are defined in `fairseq`...

> > @HeegonJin I have a basic implementation of KD in my repo [here](https://github.com/VarunGumma/fairseq). It is based on the implementation at https://github.com/LeslieOverfitting/selective_distillation. They use a much older version of `fairseq` and...

Will fairseq-v2 support PyTorch 2.0?

Any update on v2?