Mostafa Elhoushi
We need to handle unsigned tensors, which have no sign bit. This is the case for the input activations of a convolution, which are usually the output of a ReLU layer....
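To make the signed versus unsigned distinction concrete, here is a minimal sketch assuming plain uniform integer quantization, which may differ from the scheme the repository actually uses. The helper names `quant_range` and `quantize` are hypothetical. Dropping the sign bit gives non-negative data such as ReLU outputs the full `2**bits` levels instead of half of them.

```python
from typing import List, Tuple


def quant_range(bits: int, signed: bool) -> Tuple[int, int]:
    """Return the (min, max) integer codes for a given bit width."""
    if signed:
        # One bit is spent on the sign: e.g. 8 bits -> [-128, 127].
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    # No sign bit: e.g. 8 bits -> [0, 255].
    return 0, 2 ** bits - 1


def quantize(values: List[float], scale: float, bits: int = 8,
             signed: bool = True) -> List[int]:
    """Scale, round, and clamp each value to the representable range."""
    lo, hi = quant_range(bits, signed)
    return [min(hi, max(lo, round(v / scale))) for v in values]


if __name__ == "__main__":
    relu_like = [0.0, 1.0, 200.0]  # non-negative, like ReLU outputs
    print(quantize(relu_like, scale=0.5, signed=True))
    print(quantize(relu_like, scale=0.5, signed=False))
```

With the same scale, the unsigned variant saturates at a larger code than the signed one, so fewer large activations are clipped.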
I am trying to run the test cases, but ran into errors like:
> pandas.util.version.InvalidVersion: Invalid version: '0.14.0.RAY'

Reinstalling an older version of pandas probably solved this error but led...
Thanks for sharing the code. I have noticed a small typo in the last 2 commands in the README file:
```
python cifar-nuclear-regularization.py.py
```
there's an extra `.py` in the...
**Describe the solution you would like:** Create a fairseq2 wrapper class/script to enable LM Evaluation Harness, https://github.com/EleutherAI/lm-evaluation-harness. We need to create a [wrapper class](https://github.com/EleutherAI/lm-evaluation-harness/blob/main/docs/interface.md#external-library-usage) that implements the following functions:...
**Describe the solution you would like:** Implement self-speculative decoding as described in this [paper](https://arxiv.org/abs/2404.16710) where the earlier layers act as the draft stage and remaining layers act as the verification...
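The draft-then-verify loop can be illustrated with a toy, framework-free sketch. It assumes greedy decoding and stands in two plain functions for the model: `draft_next` plays the role of the early-exit (earlier layers) prediction and `full_next` the full-depth prediction. Both names, and `num_speculations`, are hypothetical, and a real implementation would verify all draft tokens in one batched forward pass rather than one call per token.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_generate(prompt: List[int], draft_next: NextToken,
                         full_next: NextToken, max_new_tokens: int,
                         num_speculations: int = 4) -> List[int]:
    """Greedy draft-and-verify decoding over integer token ids."""
    tokens = list(prompt)
    target_len = len(prompt) + max_new_tokens
    while len(tokens) < target_len:
        # Draft stage: cheaply guess a few tokens ahead.
        draft: List[int] = []
        for _ in range(min(num_speculations, target_len - len(tokens))):
            draft.append(draft_next(tokens + draft))
        # Verify stage: keep draft tokens while they match the full model;
        # at the first mismatch, keep the full model's token and stop.
        accepted: List[int] = []
        for i, guess in enumerate(draft):
            correct = full_next(tokens + draft[:i])
            accepted.append(correct)
            if correct != guess:
                break
        tokens.extend(accepted)
    return tokens


if __name__ == "__main__":
    # Toy "models": the draft disagrees with the full model after token 2.
    full = lambda t: (t[-1] * 3 + 1) % 7
    draft = lambda t: 0 if t[-1] == 2 else (t[-1] * 3 + 1) % 7
    print(speculative_generate([1], draft, full, max_new_tokens=10))
```

Because every accepted token is the full model's own greedy choice, the output matches ordinary greedy decoding with `full_next`; the draft only affects how many tokens are accepted per verification round.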
**Describe the solution you would like:**
- Enable the training script to access outputs of intermediate layers
- Modify loss function to incorporate outputs of earlier layers

**Describe the alternatives...
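One possible shape for such a loss, sketched in plain Python: compute a cross-entropy term from each layer's logits and combine them as a weighted average. This assumes every intermediate output has already been projected to vocabulary logits (for example through a shared output head), and the function names and the weighting scheme are hypothetical rather than taken from the issue.

```python
import math
from typing import List, Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Cross-entropy of one example, using a numerically stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multi_layer_loss(per_layer_logits: List[Sequence[float]], target: int,
                     weights: Sequence[float]) -> float:
    """Weighted average of the per-layer cross-entropy losses."""
    if len(per_layer_logits) != len(weights):
        raise ValueError("need exactly one weight per layer")
    total = sum(w * cross_entropy(logits, target)
                for logits, w in zip(per_layer_logits, weights))
    return total / sum(weights)


if __name__ == "__main__":
    # Logits from an early layer (uncertain) and the final layer (confident).
    layers = [[0.1, 0.0, -0.1], [3.0, 0.0, -1.0]]
    print(multi_layer_loss(layers, target=0, weights=[0.5, 1.0]))
```

Setting every weight except the last to zero recovers the usual final-layer loss.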
**Describe the solution you would like:** Would like to enable configuration of a different layer dropout rate for each layer. **Describe the alternatives you have considered:** Currently, layer dropout is...
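A per-layer rate could be configured along these lines. This is only a sketch: the schedule names (`"constant"`, `"linear"`), the function names, and the treatment of layers as plain callables are all hypothetical and not taken from the existing training code.

```python
import random
from typing import Callable, List, Sequence


def layer_dropout_rates(num_layers: int, max_rate: float,
                        schedule: str = "linear") -> List[float]:
    """Build one dropout probability per layer."""
    if schedule == "constant":
        return [max_rate] * num_layers
    if schedule == "linear":
        if num_layers == 1:
            return [max_rate]
        # Rate grows from 0 at the first layer to max_rate at the last.
        return [max_rate * i / (num_layers - 1) for i in range(num_layers)]
    raise ValueError(f"unknown schedule: {schedule}")


def forward_with_layer_dropout(x: float,
                               layers: Sequence[Callable[[float], float]],
                               rates: Sequence[float], rng: random.Random,
                               training: bool = True) -> float:
    """Apply layers in order, skipping layer i with probability rates[i]."""
    for layer, p in zip(layers, rates):
        if training and rng.random() < p:
            continue  # skip this layer for the current step
        x = layer(x)
    return x


if __name__ == "__main__":
    layers = [lambda v: v + 1.0, lambda v: v * 2.0, lambda v: v - 3.0]
    rates = layer_dropout_rates(len(layers), max_rate=0.5)
    print(rates)
    print(forward_with_layer_dropout(1.0, layers, rates, random.Random(0)))
```

Passing an explicit list of rates instead of a schedule would give full per-layer control; at evaluation time (`training=False`) no layer is skipped.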
I have created this PR to enable generation tasks. To test:
```
python eval.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --tasks nq_open
```
#### Context
What is the purpose of this PR? Is it to
- [X] add a new feature
- [ ] fix a bug
- [ ] update tests and/or...
Still a work in progress. To run:
```
cd torchao/_models/llama
python generate.py --checkpoint_path ${CHECKPOINT_PATH}/model.pth --superblock
```