
Question: configuring scaled_dot_product_attention

Open pfeatherstone opened this issue 1 year ago • 0 comments

It looks like, from https://github.com/lucidrains/recurrent-memory-transformer-pytorch/blob/98bf3091a29fbd65dbbb30ce00dd1cadd05fef2d/recurrent_memory_transformer_pytorch/attend.py#L62-L67 and https://github.com/lucidrains/recurrent-memory-transformer-pytorch/blob/98bf3091a29fbd65dbbb30ce00dd1cadd05fef2d/recurrent_memory_transformer_pytorch/attend.py#L93-L99, that `F.scaled_dot_product_attention()` is configured manually. The documentation says: "All implementations are enabled by default. Scaled dot product attention attempts to automatically select the most optimal implementation based on the inputs." Can't we just let PyTorch decide?
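
For reference, here is a minimal sketch of the two approaches, assuming a PyTorch 2.x build from around that time (the exact backend flags used in `attend.py` are not reproduced here, only the general pattern):

```python
import torch
import torch.nn.functional as F

# toy q, k, v tensors: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 8, 128, 64, device='cuda', dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device='cuda', dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device='cuda', dtype=torch.float16)

# Option A: let PyTorch pick the backend automatically (the documented default).
out_auto = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# Option B: restrict the allowed backends explicitly, which is roughly what the
# linked attend.py lines appear to do. torch.backends.cuda.sdp_kernel is the
# PyTorch 2.0/2.1-era context manager for this.
with torch.backends.cuda.sdp_kernel(
    enable_flash=True,           # allow the flash-attention kernel
    enable_math=False,           # disallow the reference math implementation
    enable_mem_efficient=False,  # disallow the memory-efficient kernel
):
    out_flash = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```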

pfeatherstone · Aug 08 '23 09:08