OmDet
Batched MultiHeadAttention
Hi! Yoni from Hugging Face again. I'm opening a separate issue because there seems to be an important problem in the model's encoder.
https://github.com/om-ai-lab/OmDet/blob/542ce974ee22e16f9e532500e3f84e4702c03abf/omdet/omdet_v2_turbo/ela_encoder.py#L27
Shouldn't this MultiHeadAttention be initialized with batch_first=True, since the inputs of the self_attn layer have the shape (batch_size, ...)? With the default batch_first=False, the layer reads the first dimension as the sequence axis rather than the batch axis, which causes inconsistencies when using the model for batch inference.
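For illustration, here is a minimal, self-contained sketch of the behavior I mean (dimensions and values are made up, not taken from OmDet's config): with the default batch_first=False, the output for a given sample changes depending on what else is in the batch, and constructing the layer with batch_first=True removes the discrepancy.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

embed_dim, num_heads = 256, 8  # illustrative values, not OmDet's config
batch, seq_len = 4, 10

# Default construction: batch_first=False, so inputs are read as
# (seq_len, batch, embed_dim).
attn = nn.MultiheadAttention(embed_dim, num_heads).eval()

x = torch.randn(batch, seq_len, embed_dim)  # (batch, seq, dim) layout

with torch.no_grad():
    out_batched, _ = attn(x, x, x)             # sample 0 inside a batch of 4
    out_single, _ = attn(x[:1], x[:1], x[:1])  # sample 0 on its own

# With correct batching these would be identical; here dim 0 is read as the
# sequence axis, so sample 0's output depends on the rest of the batch.
print(torch.allclose(out_batched[0], out_single[0], atol=1e-6))  # False

# Same weights, but constructed with batch_first=True: the mismatch goes away.
attn_bf = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True).eval()
attn_bf.load_state_dict(attn.state_dict())

with torch.no_grad():
    out_batched, _ = attn_bf(x, x, x)
    out_single, _ = attn_bf(x[:1], x[:1], x[:1])

print(torch.allclose(out_batched[0], out_single[0], atol=1e-6))  # True
```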
Thanks for your consideration!