benchmark
Fix llama_v2_7b_16h for torch.jit.trace
Original error: Attention using SDPA can not be traced with torch.jit.trace when no attention_mask is provided. To solve this issue, please either load your model with the argument attn_implementation="eager" or pass an attention_mask input when tracing the model.
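The second suggested fix (supplying an explicit attention mask at trace time) can be sketched with a minimal, hypothetical module — this is not the actual llama_v2_7b_16h benchmark harness, just an illustration of why `torch.jit.trace` needs the mask to be a concrete input:

```python
# Minimal sketch, not the real benchmark fix: tracing a module that calls
# scaled_dot_product_attention with an explicit attn_mask input, mirroring
# the advice to pass attention_mask when tracing (the alternative being
# attn_implementation="eager" at model load time).
import torch
import torch.nn.functional as F

class TinySDPA(torch.nn.Module):
    def forward(self, q, k, v, attn_mask):
        # With the mask supplied as a trace input, SDPA's dispatch is
        # well-defined and the traced graph does not depend on a
        # mask-is-None branch taken at trace time.
        return F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)

q = k = v = torch.randn(1, 4, 8, 16)   # (batch, heads, seq_len, head_dim)
mask = torch.zeros(1, 1, 8, 8)         # additive mask; zeros = fully visible
traced = torch.jit.trace(TinySDPA(), (q, k, v, mask))
out = traced(q, k, v, mask)
```

Tracing the same module with `attn_mask=None` baked in is exactly the situation the error message warns about, since the traced graph would hard-code the no-mask code path.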