Vitaliy Chiley

Results 64 comments of Vitaliy Chiley

This can't be done within a `setup.py` file in my project...

If you are able to modify the code, could you try setting `inplace` [here](https://github.com/mosaicml/llm-foundry/blob/main/llmfoundry/models/layers/attention.py#L175) to `False`?
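For context on what flipping that flag means: an `inplace` option typically controls whether an op mutates its input buffer or returns a fresh copy; mutating in place can interact badly with other code that still holds the original values. A minimal stdlib-only sketch of that flag pattern (the function and names here are illustrative, not llm-foundry's actual code):

```python
def scale(x, factor, inplace=True):
    """Scale a list of values, optionally in place.

    Mirrors the common `inplace` flag convention: when True, the
    input list is mutated and returned; when False, a new list is
    built and the input is left untouched.
    """
    if inplace:
        for i in range(len(x)):
            x[i] *= factor
        return x
    return [v * factor for v in x]

a = [1.0, 2.0]
b = scale(a, 2.0, inplace=False)
# a is untouched: [1.0, 2.0]; b holds the scaled copy: [2.0, 4.0]
```

Setting `inplace=False` trades a little extra memory for the guarantee that nothing else holding a reference to the input sees it change underneath them.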

@samhavens should we also add the option to not predict BOS (assuming the previous token is the end of the previous sequence)?
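The idea above can be sketched as a loss mask over next-token targets: when sequences are packed back to back, a position whose target is BOS would ask the model to predict the start of an unrelated sequence, so that position's loss is zeroed. A minimal sketch, assuming BOS id 0 and simple Python lists (a real implementation would operate on tensors):

```python
BOS = 0  # assumed BOS token id for illustration

# Two packed sequences, each starting with BOS.
tokens = [0, 5, 6, 7, 0, 8, 9]

# Next-token-prediction targets: each position predicts the following token.
targets = tokens[1:]

# Zero the loss wherever the target is BOS, i.e. where the "previous
# token" is actually the end of the previous packed sequence.
loss_mask = [0 if t == BOS else 1 for t in targets]
# loss_mask == [1, 1, 1, 0, 1, 1]
```

Whether to expose this as an option depends on the packing scheme; with attention already blocked between packed sequences, predicting BOS carries no useful signal.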

> The implementation currently supports multihead and grouped query attention. I was not able to find a good way to parallelize the attention bias with ALiBi in this setting -...