torchscale
Published 20 hours ago • microsoft
Where is the offset implemented in multi-head dilated attention?
Open
AshStuff opened this issue 2 months ago • 0 comments
Apr 20 '24 19:04