Tri Dao

447 comments of Tri Dao

nvcr 23.12 uses pytorch nightly 2.2.0.dev20231106. flash-attn wheels up to version 2.5.1 were compiled with pytorch nightly; after that, the official pytorch 2.2.0 was released and we compiled wheels with pytorch 2.2.0....
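If you want to confirm which torch build a given nvcr container actually ships, a quick check (a minimal sketch; the image tag is the one from the comment above):

```
docker run --rm nvcr.io/nvidia/pytorch:23.12-py3 python -c "import torch; print(torch.__version__)"
```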

Idk, pytorch / cuda compatibility is messy. nvcr pytorch 23.10 uses pytorch 2.1.0a0+32f93b1. I think our wheels are compiled with official pytorch 2.1.0. The two wheels might not be compatible.
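The usual culprit for this kind of mismatch is the C++ ABI the torch build was compiled with. A minimal sketch for checking it, using `torch._C._GLIBCXX_USE_CXX11_ABI` (a real torch attribute; reading it this way as a compatibility test is my suggestion):

```
# Prints True for cxx11abi-TRUE builds (e.g. NGC containers) and False for
# the official PyPI wheels; a prebuilt flash-attn wheel must match this flag.
python -c "import torch; print(torch._C._GLIBCXX_USE_CXX11_ABI)"
```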

12.3 and 12.2 should be compatible. I've just tried nvcr pytorch 23.12 and it works fine:
```
docker run --rm -it --gpus all --network="host" --shm-size=900gb nvcr.io/nvidia/pytorch:23.12-py3
pip install flash-attn==2.5.1.post1
ipython...
```

Why do you use the URL directly instead of pip? pip will run setup.py to choose the correct wheel. In this case you want the wheel to have `abiTRUE`, not...
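For illustration, the ABI tag appears in the wheel filename that setup.py selects; a minimal sketch, with a hypothetical asset name in flash-attn's naming scheme rather than one copied from an actual release:

```
# Letting setup.py pick the asset matches the cxx11abi tag to your torch build:
pip install flash-attn==2.5.1.post1
# e.g. it would choose something like
#   flash_attn-2.5.1.post1+cu122torch2.2cxx11abiTRUE-cp310-cp310-linux_x86_64.whl
# where cxx11abiTRUE is what an NGC container's torch requires.
```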

Try following this?
```
docker run --rm -it --gpus all --network="host" --shm-size=900gb nvcr.io/nvidia/pytorch:23.12-py3
pip install flash-attn==2.5.1.post1
```

`cd csrc/rotary && python setup.py install`

`rotary_emb` is not part of the flash attention package, so you don't have to use it. You can also use pip, sth like `pip install "git+https://github.com/Dao-AILab/flash-attention.git#subdirectory=csrc/rotary"`, which does the same thing...
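A quick way to confirm the extension built and installed, whichever route you take (a minimal check; `rotary_emb` is the module name that setup.py produces):

```
python -c "import rotary_emb; print('rotary_emb OK')"
```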

It's the Triton version. As mentioned at the beginning of the file:
```
Tested with triton==2.0.0.dev20221202. Triton 2.0 has a new backend (MLIR) but seems like it doesn't yet work...
```
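If you're on a newer Triton and hit this, pinning the version named in that header is the straightforward workaround (the pin comes from the file's own note; whether that exact nightly is still published on PyPI is not guaranteed):

```
pip install triton==2.0.0.dev20221202
```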