dianaml0

Results: 7 issues by dianaml0

# Before submitting - [ ] Was this discussed/approved via a GitHub issue? (no need for typos, doc improvements) - [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)? - [...

CLA Signed

# Before submitting - [ ] Was this discussed/approved via a GitHub issue? (no need for typos, doc improvements) - [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)? - [...

CLA Signed

## What does this PR do? Adds Triton Flash Attention. Performance compared to vanilla: ``` [--------- attention (attn_bias=) --------] | optimized | vanilla 1 threads: -------------------------------------------------- f16 B=384, M=197, H=1,...

CLA Signed

## What does this PR do? CircleCI [failing with timeout](https://app.circleci.com/pipelines/github/facebookresearch/xformers/2002/workflows/bd88d6c1-ee25-4e6a-8dbb-b00baa2b9a2a/jobs/4864). ## Before submitting - [ ] Did you have fun? - Make sure you had fun coding 🙃 - [...

CLA Signed

# 🚀 Feature Capturing an idea from @blefaudeux [here](https://github.com/facebookresearch/xformers/pull/145#issuecomment-996223858). Add LRA to the CI, since some bugs only show up through manual runs of it. Should be a smaller version...

**Patch Description** Update CircleCI **Testing steps** Describe how you tested your changes

cla signed

**Patch Description** Creating this PR off of #511, so it can be reviewed by @stephenroller. The last commit (3d709dba5c4be713fd821dc4e0f6b6f90f5ead40) removes some changes from the sequence parallel code which enabled testing...

cla signed