MiniCPM
Inference cannot be run on a V100 because of FlashAttention
Description
FlashAttention only supports Ampere GPUs or newer.
Case Explanation
Because the model depends on FlashAttention, inference cannot be run on a V100.
done with caseid 592863
Hi, you can use eager mode for inference.
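
For anyone landing here, the suggestion above might look roughly like the sketch below. It assumes the checkpoint is loaded through Hugging Face `transformers` and that its modeling code honors the `attn_implementation` argument, which is not confirmed in this thread. `pick_attn_implementation` and `load_model` are hypothetical helper names, and `model_id` stands for whichever MiniCPM checkpoint you are using.

```python
def pick_attn_implementation(cc_major: int) -> str:
    """Choose an attention backend from the CUDA compute capability major version.

    Ampere and newer GPUs report 8 or higher; a V100 (Volta) reports 7.
    """
    return "flash_attention_2" if cc_major >= 8 else "eager"


def load_model(model_id: str):
    """Load a causal LM, falling back to eager attention on pre-Ampere GPUs.

    Assumption: the checkpoint's code respects `attn_implementation`.
    """
    # Local imports so the helper above can be used without a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    cc_major, _ = torch.cuda.get_device_capability(0)  # V100 reports (7, 0)
    attn = pick_attn_implementation(cc_major)

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # V100 has no native bfloat16 support
        attn_implementation=attn,
        trust_remote_code=True,
    ).to("cuda")
    return tokenizer, model
```

Eager attention is slower and uses more memory than FlashAttention, so long prompts may need a smaller batch size on a V100.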