multi-head-attention topic
SentEncoding
Sentence encoder and training code for Mean-Max AAE
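
The Mean-Max AAE builds a sentence vector by concatenating mean- and max-pooled token representations. A minimal sketch of that pooling step, assuming a `[batch, seq_len, hidden]` tensor of token states and a padding mask (the tensor names are illustrative, not the repo's API):

```python
import torch

def mean_max_pool(hidden, mask):
    """Concatenate mean- and max-pooled token states into one sentence vector.

    hidden: [batch, seq_len, dim] token representations
    mask:   [batch, seq_len] with 1 for real tokens, 0 for padding
    """
    mask = mask.unsqueeze(-1).float()                      # [batch, seq_len, 1]
    # Mean over non-padding positions only.
    mean = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    # Max over non-padding positions; padding is pushed to -inf first.
    maxed = hidden.masked_fill(mask == 0, float("-inf")).max(dim=1).values
    return torch.cat([mean, maxed], dim=-1)                # [batch, 2 * dim]

states = torch.randn(4, 10, 256)
mask = torch.ones(4, 10)
print(mean_max_pool(states, mask).shape)  # torch.Size([4, 512])
```
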
Att-Induction
Attention-based Induction Networks for Few-Shot Text Classification
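
Induction networks for few-shot classification induce a class vector from each class's handful of support examples and then score queries against it; the attention-based variant weights the support examples with attention rather than treating them uniformly. A minimal sketch of that idea (shapes and names are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def attentive_induction(support, query):
    """Induce per-class vectors by attending over support examples.

    support: [n_classes, k_shot, dim] encoded support sentences
    query:   [n_query, dim] encoded query sentences
    returns: [n_query, n_classes] similarity scores
    """
    dim = support.size(-1)
    # Each query attends over each class's k support examples.
    scores = torch.einsum("qd,ckd->qck", query, support) / dim ** 0.5
    weights = F.softmax(scores, dim=-1)                    # [n_query, n_classes, k]
    class_vecs = torch.einsum("qck,ckd->qcd", weights, support)
    # Cosine similarity between each query and its induced class vectors.
    return F.cosine_similarity(query.unsqueeze(1), class_vecs, dim=-1)

logits = attentive_induction(torch.randn(5, 3, 64), torch.randn(8, 64))
print(logits.shape)  # torch.Size([8, 5])
```
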
Multi2OIE
Multi^2OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT (Findings of ACL: EMNLP 2020)
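
Multi^2OIE's central move is to run multi-head attention over BERT's hidden states when extracting arguments. A hedged sketch of that composition using the Hugging Face `transformers` library (simplified: the paper conditions the attention on the predicted predicate, and the label head here is illustrative):

```python
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased")

# Multi-head attention on top of BERT hidden states (self-attention here;
# the paper uses a predicate-aware query).
mha = nn.MultiheadAttention(embed_dim=768, num_heads=8, batch_first=True)
arg_classifier = nn.Linear(768, 5)  # e.g. BIO-style argument labels (illustrative)

inputs = tokenizer("Seoul is the capital of South Korea.", return_tensors="pt")
hidden = bert(**inputs).last_hidden_state          # [1, seq_len, 768]
attended, _ = mha(hidden, hidden, hidden)
logits = arg_classifier(attended)                  # per-token argument logits
print(logits.shape)
```
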
attention
Several types of attention modules implemented in PyTorch
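
For reference, the multi-head attention computation these modules implement can be written compactly in PyTorch. A minimal sketch (no dropout, masking, or separate Q/K/V projections):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    """Minimal multi-head attention: project, split heads, scaled dot-product, merge."""

    def __init__(self, dim, num_heads):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):
        b, n, d = x.shape
        qkv = self.qkv(x).view(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)   # three [b, heads, n, head_dim] tensors
        attn = F.softmax(q @ k.transpose(-2, -1) / self.head_dim ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)  # merge heads back
        return self.out(out)

x = torch.randn(2, 16, 128)
print(MultiHeadAttention(128, 8)(x).shape)  # torch.Size([2, 16, 128])
```
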
scDINO
Self-Supervised Vision Transformers for multiplexed imaging datasets
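
scDINO applies DINO-style self-distillation to vision transformers on multi-channel microscopy images: a student network is trained to match a momentum teacher's centered, sharpened outputs across augmented views. A simplified illustration of that update, not the repo's code (the linear "backbone" is a stand-in for a ViT):

```python
import copy
import torch
import torch.nn.functional as F

def dino_loss(student_out, teacher_out, center, tau_s=0.1, tau_t=0.04):
    """Cross-entropy between teacher targets and student predictions.

    The teacher output is centered and sharpened; gradients flow only
    through the student (the teacher is updated by EMA, not backprop).
    """
    t = F.softmax((teacher_out - center) / tau_t, dim=-1).detach()
    s = F.log_softmax(student_out / tau_s, dim=-1)
    return -(t * s).sum(dim=-1).mean()

@torch.no_grad()
def ema_update(teacher, student, momentum=0.996):
    """Teacher weights are an exponential moving average of the student's."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1 - momentum)

student = torch.nn.Linear(32, 64)                 # stand-in for a ViT + head
teacher = copy.deepcopy(student)
x1, x2 = torch.randn(8, 32), torch.randn(8, 32)   # two augmented views
loss = dino_loss(student(x1), teacher(x2), center=torch.zeros(64))
loss.backward()
ema_update(teacher, student)
print(float(loss))
```
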
flash_attention_inference
Benchmarks the performance of the C++ interfaces of FlashAttention and FlashAttention-2 in large language model (LLM) inference scenarios.
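
FlashAttention computes exact attention without materializing the full attention matrix. The repo benchmarks a C++ interface, but in PyTorch the same kernel family is reachable through `torch.nn.functional.scaled_dot_product_attention`, which makes a handy correctness reference; a sketch under that assumption (requires a CUDA GPU):

```python
import torch
import torch.nn.functional as F

# Typical inference shapes: [batch, heads, seq_len, head_dim].
q = torch.randn(1, 32, 1024, 128, device="cuda", dtype=torch.float16)
k = torch.randn(1, 32, 1024, 128, device="cuda", dtype=torch.float16)
v = torch.randn(1, 32, 1024, 128, device="cuda", dtype=torch.float16)

# Fused attention; on supported GPUs PyTorch dispatches half-precision
# inputs to a FlashAttention-style kernel.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Naive reference: materializes the full [seq, seq] attention matrix.
mask = torch.tril(torch.ones(1024, 1024, device="cuda", dtype=torch.bool))
scores = (q @ k.transpose(-2, -1)) / 128 ** 0.5
scores = scores.masked_fill(~mask, float("-inf"))
ref = torch.softmax(scores, dim=-1) @ v

print(torch.allclose(out, ref, atol=1e-2))
```
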
decoding_attention
Decoding Attention is specially optimized for multi-head attention (MHA), using CUDA cores for the decoding stage of LLM inference.
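
In the decoding stage, a single new query token attends over the cached keys and values, so each step is a [1, seq] attention per head rather than a full matrix; that is the workload decode-optimized kernels like this one target. A minimal PyTorch sketch of one such step (names and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def decode_step(q, k_cache, v_cache, k_new, v_new):
    """One autoregressive decode step of multi-head attention.

    q:               [batch, heads, 1, head_dim]   query for the new token
    k_cache/v_cache: [batch, heads, seq, head_dim] cached past keys/values
    k_new/v_new:     [batch, heads, 1, head_dim]   key/value for the new token
    """
    k = torch.cat([k_cache, k_new], dim=2)    # grow the KV cache by one position
    v = torch.cat([v_cache, v_new], dim=2)
    # Single-query attention: scores are [batch, heads, 1, seq + 1].
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    out = F.softmax(scores, dim=-1) @ v       # [batch, heads, 1, head_dim]
    return out, k, v

b, h, s, d = 1, 32, 512, 128
out, k, v = decode_step(torch.randn(b, h, 1, d),
                        torch.randn(b, h, s, d), torch.randn(b, h, s, d),
                        torch.randn(b, h, 1, d), torch.randn(b, h, 1, d))
print(out.shape, k.shape)  # torch.Size([1, 32, 1, 128]) torch.Size([1, 32, 513, 128])
```
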