
[Bug] Throughput benchmarking script fails

Open sustcsonglin opened this issue 6 months ago • 1 comment

Checklist

  • [x] I have checked FAQs and existing issues for similar problems
  • [x] My GPU is H100 and I have installed the triton-nightly built by the fla team, and double-checked the FAQs
  • [x] Please report this bug in English to ensure wider understanding and support

Describe the Bug

  File "/data/cl/user/yangsl66/miniconda3/envs/fla/lib/python3.12/site-packages/fla/models/transformer/modeling_transformer.py", line 374, in forward
    logits = None if fuse_linear_and_cross_entropy else self.lm_head(hidden_states[:, -logits_to_keep:])
                                                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  RuntimeError: Boolean value of Tensor with more than one value is ambiguous
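For context, PyTorch raises this exact RuntimeError whenever a tensor with more than one element is coerced to a single Python bool (e.g. in an `if`/ternary condition or a negated slice bound). A minimal standalone reproduction, independent of fla (the variable names below are illustrative, not taken from the benchmarking script):

```python
import torch

# A multi-element tensor, e.g. per-sample positions passed where an int was expected
logits_to_keep = torch.tensor([1, 0])

try:
    # Coercing a multi-element tensor to bool is ambiguous: should it be
    # True if ANY element is nonzero, or only if ALL are? PyTorch refuses.
    bool(logits_to_keep)
except RuntimeError as e:
    print(e)  # "Boolean value of Tensor with more than one value is ambiguous"
```

This is consistent with `logits_to_keep` arriving as a tensor rather than a Python int at the `hidden_states[:, -logits_to_keep:]` slice, though the actual call site in the benchmarking script would need checking to confirm.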

Steps to Reproduce the Bug

N/A

Expected Behavior

N/A

Environment Information

  1. Torch:
  2. Triton:

sustcsonglin commented May 16 '25 11:05

Maybe related to https://github.com/fla-org/flash-linear-attention/pull/401

zhiyuan1i commented May 16 '25 21:05

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented Jun 22 '25 00:06