aft-pytorch

Published 20 hours ago

rish-16


Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc.

Issues

Results 3 aft-pytorch issues


Runs on CPU but fails on GPU. Why?

1

RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for baddbmm). I set .cuda()...

TangDL

I test the model in an NLP task.

1

I use the aft_full model with 6 layers, and I set it up in init with this code:

```
self.encoder_transformer = nn.ModuleList()
for _ in range(6):
    self.encoder_transformer.append(AFTFull(max_seqlen=500, dim=512, hidden_dim=256))
```

and in the forward function, I...

zshy1205

How to init the additional param w?

I want to migrate an existing LLM to this architecture. There is an additional parameter w. How should it be initialized?

wizardforcel

About

Stars: 221
Forks: 23
Watchers
Owner: rish-16
