torchscale
Foundation Architecture for (M)LLMs
Hello, I have followed the training configuration introduced here (https://github.com/microsoft/torchscale/issues/52) with the retnet_medium architecture. I have a few questions I would appreciate answers to. The first is about...
I've rewritten the `torchscale.architecture.config` module to use inheritance and remove the redundant code. There are now 3 classes: `Config` - holds all common options; `EncoderConfig` - inherits `Config` and...
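The refactor described might be sketched roughly as below with dataclasses; the field names and defaults here are illustrative placeholders, not the actual torchscale options:

```python
from dataclasses import dataclass

@dataclass
class Config:
    # Options common to encoder and decoder (illustrative names).
    embed_dim: int = 768
    layers: int = 12
    dropout: float = 0.1

@dataclass
class EncoderConfig(Config):
    # Encoder-specific options layered on top of the shared base.
    encoder_normalize_before: bool = True

# Overrides pass through the inherited fields unchanged.
cfg = EncoderConfig(embed_dim=1024)
```

Shared options then live in exactly one place, and each subclass only declares what differs.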
Five classes in the codebase explicitly inherit from `object`. I am guessing this was an oversight.
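For context, in Python 3 every class implicitly inherits from `object`, so spelling it out is a redundant Python 2 holdover; the two spellings produce identical classes:

```python
class WithObject(object):
    pass

class Without:
    pass

# Both method resolution orders end at object either way.
assert WithObject.__mro__[-1] is object
assert Without.__mro__[-1] is object
```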
This may be a naive question, but how can I verify for myself that I can process a huge attention window using torchscale? Ideally,...
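One reason this is worth checking empirically: with standard attention, the score matrix alone grows quadratically with the window size, which is exactly what architectures like RetNet aim to avoid. A back-of-the-envelope helper (hypothetical, not part of torchscale) makes the scaling concrete:

```python
def attn_matrix_bytes(batch: int, heads: int, seq_len: int, dtype_bytes: int = 2) -> int:
    # Memory for the full seq_len x seq_len attention score matrix
    # (per forward pass, fp16 by default) - the term that dominates
    # at long context lengths in standard multi-head attention.
    return batch * heads * seq_len * seq_len * dtype_bytes

# Doubling the window quadruples the score-matrix memory.
assert attn_matrix_bytes(1, 16, 8192) == 4 * attn_matrix_bytes(1, 16, 4096)

print(attn_matrix_bytes(1, 16, 65536) / 2**30, "GiB")  # 128.0 GiB at a 64k window
```

So a simple sanity test is to run a forward pass at increasing sequence lengths and watch whether peak memory grows quadratically (standard attention) or roughly linearly.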
This pull request adds support for the Flash Attention mechanism to the MultiheadAttention module. Flash Attention is a recently proposed alternative to the conventional multi-head attention mechanism which reduces memory...
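The core trick behind Flash Attention — computing exact attention over key/value blocks with an online softmax so the full n×n score matrix is never materialized — can be sketched as below. This is an illustrative NumPy reference of the algorithm, not the fused CUDA kernel the PR would actually use:

```python
import numpy as np

def naive_attention(q, k, v):
    # Materializes the full (n, n) score matrix: O(n^2) memory.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def blocked_attention(q, k, v, block=16):
    # Flash-Attention-style streaming: visit keys/values one block at a
    # time, carrying a running row-max (m), normalizer (l), and output
    # accumulator (acc), so only an (n, block) score tile ever exists.
    n, d = q.shape
    m = np.full(n, -np.inf)
    l = np.zeros(n)
    acc = np.zeros((n, d))
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = q @ kb.T / np.sqrt(d)              # (n, block) tile only
        m_new = np.maximum(m, s.max(axis=1))
        scale = np.exp(m - m_new)              # rescale previous partials
        p = np.exp(s - m_new[:, None])
        acc = acc * scale[:, None] + p @ vb
        l = l * scale + p.sum(axis=1)
        m = m_new
    return acc / l[:, None]

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(64, 32)) for _ in range(3))
assert np.allclose(naive_attention(q, k, v), blocked_attention(q, k, v))
```

The two functions return the same result; the blocked version just trades the quadratic score matrix for per-row running statistics, which is what makes the memory savings possible.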
Thanks for your excellent work! I have noticed that torchscale serially executes the operations mapping x to q, k, and v, in lines 84-86 of torchscale/component/multihead_attention.py. Will this...
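For illustration, the three serial projections could be fused into a single matmul against a concatenated weight and then split, which is mathematically identical. A NumPy sketch (shapes and names are illustrative, not the module's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(4, d))
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))

# Serial: three separate projections, one per output.
q, k, v = x @ wq, x @ wk, x @ wv

# Fused: one matmul against the concatenated weight, then split.
w_qkv = np.concatenate([wq, wk, wv], axis=1)   # (d, 3d)
qf, kf, vf = np.split(x @ w_qkv, 3, axis=1)

assert np.allclose(q, qf) and np.allclose(k, kf) and np.allclose(v, vf)
```

The fused form launches one larger kernel instead of three, which is typically where any speedup would come from.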