Titans
A collection of models built with ColossalAI
### 🐛 Describe the bug
**I wrote the following code; I think the dimensions of the tensors are correct. What should I do?**
```python
class LlamaMLP(nn.Module):
    def __init__(
        self,
        hidden_size: int,
        intermediate_size:...
```
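The reporter's snippet is cut off above. For orientation only, here is a minimal sketch of a gated MLP in the LLaMA style (gate, up, and down projections); the layer names and the use of SiLU are common conventions, not code taken from the issue:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LlamaMLP(nn.Module):
    """Gated MLP in the LLaMA style (illustrative sketch, not the reporter's code)."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        # (h -> i) projections for the gating branch and the value branch
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        # (i -> h) projection back to the hidden size
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_size)
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))
```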
### 🐛 Describe the bug
Creating a TransformerEncoder causes memory overflow, but the same config works with the Hugging Face `transformers` module.
```python
# config.py
from colossalai.amp import AMP_TYPE

fp16 = dict(
    mode=AMP_TYPE.TORCH...
```
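The excerpt cuts off mid-config. For orientation, a minimal sketch of what such a ColossalAI-style AMP config typically looks like; the extra fields and their values are illustrative assumptions, not the reporter's settings:

```python
# config.py (illustrative sketch, not the reporter's actual file)
from colossalai.amp import AMP_TYPE

# Use torch.cuda.amp-based mixed precision.
fp16 = dict(mode=AMP_TYPE.TORCH)

# Typical additional fields (placeholder values, assumed for illustration).
gradient_accumulation = 1
clip_grad_norm = 1.0
```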
### Describe the feature
Compared to vanilla PyTorch, Titans currently contains a lot of redundant code, for example multiple files that each define an MLP. We could provide common MLP, Attention ... modules for... (see the sketch after the list below)
- Move the dataloader out of the model folder.
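As a concrete illustration of the feature request above, one shared feed-forward module that every model could import instead of keeping per-model copies. This is a minimal sketch; the class name and constructor arguments are assumptions, not an existing Titans API:

```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    """A generic feed-forward block that individual models could reuse
    instead of each keeping its own copy (sketch only)."""

    def __init__(self, hidden_size: int, mlp_ratio: float = 4.0,
                 activation: nn.Module = nn.GELU(), dropout: float = 0.0):
        super().__init__()
        intermediate_size = int(hidden_size * mlp_ratio)
        self.fc1 = nn.Linear(hidden_size, intermediate_size)
        self.act = activation
        self.fc2 = nn.Linear(intermediate_size, hidden_size)
        self.drop = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (b, s, h) -> (b, s, h)
        return self.drop(self.fc2(self.act(self.fc1(x))))
```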
For complex layers such as self-attention, we can add shape annotations (e.g. `# hidden states (b, s, h)`) so that other maintainers can better understand the model logic.
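For example, annotated along these lines. This is a sketch assuming a standard multi-head attention layout, not code from the repo:

```python
import torch
import torch.nn as nn


class SelfAttention(nn.Module):
    """Multi-head self-attention with shape annotations (illustrative sketch)."""

    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(hidden_size, 3 * hidden_size)
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden states (b, s, h)
        b, s, _ = hidden_states.shape
        qkv = self.qkv(hidden_states)                     # (b, s, 3 * h)
        qkv = qkv.view(b, s, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)              # each (b, n_head, s, d_head)
        scores = (q @ k.transpose(-2, -1)) * self.scale   # (b, n_head, s, s)
        attn = scores.softmax(dim=-1)                     # (b, n_head, s, s)
        out = attn @ v                                    # (b, n_head, s, d_head)
        out = out.transpose(1, 2).reshape(b, s, -1)       # (b, s, h)
        return self.proj(out)                             # (b, s, h)
```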
These tests aim to make sure the models run the forward pass correctly on given mock data.
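A minimal sketch of such a test with pytest and random mock data; the stand-in model and the shape values are illustrative assumptions, and in the repo the model under test would be one of the Titans models:

```python
import pytest
import torch
import torch.nn as nn


@pytest.mark.parametrize("batch_size,seq_len,hidden_size", [(2, 16, 64)])
def test_forward_with_mock_data(batch_size, seq_len, hidden_size):
    # Small stand-in model; replace with the Titans model under test.
    model = nn.TransformerEncoderLayer(d_model=hidden_size, nhead=4, batch_first=True)
    # Mock data: random hidden states of shape (b, s, h).
    x = torch.randn(batch_size, seq_len, hidden_size)
    out = model(x)
    # The forward pass should preserve the input shape and produce finite values.
    assert out.shape == (batch_size, seq_len, hidden_size)
    assert torch.isfinite(out).all()
```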