Varun Gumma
@EmreOzkose Yes, PyTorch 2.0 supports Python 3.11
Alternatively, you can use [my fork of fairseq](https://github.com/VarunGumma/fairseq), which supports `Python 3.11`, Knowledge Distillation, Adapters, and a few more interesting fixes.
Hi, when you use `torch.compile`, do you get a bunch of logging messages? I tried adding `torch.compile` exactly the same way you did, but my terminal is flooded with warnings...
@santha96 did you just leave the logging messages like that, or were you able to suppress them?
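For anyone hitting the same log flood: a minimal sketch of one way to silence it via the standard `logging` module, assuming the messages come from the `torch._dynamo` and `torch._inductor` loggers (the exact logger names may differ across PyTorch versions):

```python
import logging

# Assumed logger names for torch.compile internals; adjust for your
# PyTorch version if the warnings still appear.
for name in ("torch._dynamo", "torch._inductor"):
    logging.getLogger(name).setLevel(logging.ERROR)  # keep only errors
```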
Hi @bhavitvyamalik, I have a [clone of fairseq](https://github.com/VarunGumma/fairseq) that implements adapters. Feel free to use it, and if you face any issues or want more features, open a pull request,...
Hi @EIFY, I have been using your implementation of [fairseq](https://github.com/EIFY/fairseq), and I had the following question: - In the [transformer_decoder](https://github.com/EIFY/fairseq/blob/main/fairseq/models/transformer/transformer_decoder.py), I see that the alibi bias is being added to...
@gwenzek any update on the documentation?
Hi @GokulNC, do you have any leads on this? We are interested in trying this for IT2-Dist models as well, where the embedding and output-projection (`lm_head`) are tied in the...
@umarbutler can you share your implementation?
@tfglynn, can xPos be used for a regular encoder-decoder model? If so, from your answer above, I assume that it should be added to the decoder side only and...