CTranslate2
Feature request: support 4D attention masks
This is a feature request to support custom 4D attention masks for Llama (and possibly any other model), similar to https://github.com/huggingface/transformers/pull/27539
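To make the request concrete, here is a minimal sketch of the kind of mask I have in mind. It assumes a boolean mask of shape (batch, 1, query_len, key_len), where True means "may attend", and uses it to pack several sequences into one row without letting them see each other. The helper name `packed_causal_mask` is hypothetical and only for illustration; it is not part of CTranslate2 or transformers.

```python
import numpy as np


def packed_causal_mask(lengths):
    """Build a 4D mask for sequences packed into a single row.

    Hypothetical helper, not an existing API. Returns a boolean array of
    shape (1, 1, total, total) where True means the query position (row)
    may attend to the key position (column).
    """
    total = sum(lengths)
    # Segment id of every token, e.g. lengths [2, 3] -> [0, 0, 1, 1, 1].
    segment_ids = np.repeat(np.arange(len(lengths)), lengths)
    # Tokens may only attend within their own packed sequence...
    same_segment = segment_ids[:, None] == segment_ids[None, :]
    # ...and only to the current or earlier positions (causal).
    causal = np.tril(np.ones((total, total), dtype=bool))
    # Add batch and head axes; the head axis of size 1 broadcasts over heads.
    return (same_segment & causal)[None, None, :, :]


mask = packed_causal_mask([2, 3])
print(mask.shape)
print(mask[0, 0].astype(int))
```

With a plain 2D padding mask this block-diagonal pattern cannot be expressed, which is why a 4D input would be needed.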