Chandrashekar KP


I'm facing latency issues when running inference with the Falcon LLM: a single run takes around 20-30 minutes for a specific use case. I want to reduce this time and found that we...

Usage
Flash Attention
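One common way to cut Falcon inference latency, in line with the Flash Attention topic above, is to load the model with FlashAttention-2 enabled in Hugging Face Transformers. A minimal sketch follows; the model checkpoint (`tiiuae/falcon-7b`) and dtype are assumptions, and FlashAttention-2 additionally requires the `flash-attn` package and a recent NVIDIA GPU:

```python
def falcon_load_kwargs():
    # Keyword arguments for AutoModelForCausalLM.from_pretrained.
    # attn_implementation="flash_attention_2" switches the attention kernels
    # to FlashAttention-2 (needs the flash-attn package and an Ampere+ GPU).
    return dict(
        torch_dtype="bfloat16",               # assumed half-precision dtype
        attn_implementation="flash_attention_2",
        device_map="auto",                    # place layers on available GPUs
    )

if __name__ == "__main__":
    # Hypothetical usage; downloading and loading the checkpoint requires
    # network access and GPU memory.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "tiiuae/falcon-7b", **falcon_load_kwargs()
    )
```

If the `flash-attn` package is not installed, dropping the `attn_implementation` argument falls back to the default attention implementation, which is slower but has no extra dependencies.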