Caleb Thomas
I'm not sure if this is the same issue that @willx-y has, but I got the same error message when running inference on a protein FASTA with the `--use_precomputed_alignments` option...
I had the same problem. I was able to successfully work around it by cloning the git repo and installing from source:

```sh
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
git checkout...
```
> > I had the same problem. I was able to successfully work around it by cloning the git repo and installing from source:
> >
> > ```shell
> > git...
Apparently this is a known issue with the Hopper architecture: https://github.com/triton-lang/triton/pull/2627
In case anyone else has a similar problem, I was able to successfully work around the issue by removing all `num_warps = 8` autotune configurations from my flash attention kernel.
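For anyone who wants to try the same workaround, here is a minimal sketch of what I mean by filtering the autotune configurations. The kernel name, block-size parameters, and the exact config list are just placeholders for illustration, not the actual flash-attention source; the only point is dropping every `triton.Config` whose `num_warps` is 8 before it reaches `@triton.autotune`:

```python
import triton
import triton.language as tl

# Illustrative search space; the real kernel's configs will differ.
_all_configs = [
    triton.Config({"BLOCK_M": bm, "BLOCK_N": bn}, num_stages=s, num_warps=w)
    for bm in (64, 128)
    for bn in (32, 64)
    for s in (3, 4)
    for w in (4, 8)
]

# Drop every configuration that uses 8 warps (the ones that appear to
# trigger the Hopper-related Triton issue linked above).
_safe_configs = [c for c in _all_configs if c.num_warps != 8]

@triton.autotune(configs=_safe_configs, key=["N_CTX"])
@triton.jit
def _attn_fwd_kernel(Q, K, V, Out, N_CTX,
                     BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr):
    # Kernel body unchanged; only the autotune search space shrinks.
    ...
```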