Enrico Shippole
@lucidrains Here are the runs for PaLM **with** flash-cosine-sim-attention.
- 6.52 s/it
- Sequence length 8192
- fp32

For 14k steps: [screenshot] And for the whole training run: ...
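For reference, here is a minimal sketch of the kernel call at the settings quoted above (sequence length 8192, fp32). The batch size, head count, and head dimension are placeholders rather than the PaLM run's actual values, and the `causal` keyword is assumed from the repository's README; this is not the benchmark script itself.

```python
import torch
from flash_cosine_sim_attention import flash_cosine_sim_attention

# Placeholder shapes (batch, heads, seq_len, dim_head); only the
# sequence length (8192) and dtype (fp32, the torch.randn default)
# match the run described above. The kernel requires a CUDA device.
q = torch.randn(1, 8, 8192, 64).cuda()
k = torch.randn(1, 8, 8192, 64).cuda()
v = torch.randn(1, 8, 8192, 64).cuda()

# Single fused attention call; `causal=True` (keyword assumed from the
# repo README) applies the autoregressive mask a PaLM decoder needs.
out = flash_cosine_sim_attention(q, k, v, causal=True)  # (1, 8, 8192, 64)
```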
@lucidrains Here is the code for a ViT-16 **with** flash-cosine-sim-attention:

```python
import torch
from torch import nn
from einops import rearrange, repeat
from einops.layers.torch import Rearrange
from flash_cosine_sim_attention import flash_cosine_sim_attention
...
```
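The snippet above is cut off after the imports, so here is a minimal sketch of how the kernel can slot into a ViT attention block. The `CosineSimAttention` class name and the `dim`/`heads`/`dim_head` defaults are illustrative assumptions, not the exact ViT-16 configuration; only the `flash_cosine_sim_attention(q, k, v)` call itself follows the repository's documented usage.

```python
import torch
from torch import nn
from einops import rearrange
from flash_cosine_sim_attention import flash_cosine_sim_attention


class CosineSimAttention(nn.Module):
    """Multi-head attention block that delegates the attention computation
    to the fused flash-cosine-sim CUDA kernel (GPU only).
    Hyperparameter defaults are illustrative, not the ViT-16 values."""

    def __init__(self, dim=512, heads=8, dim_head=64):
        super().__init__()
        inner_dim = heads * dim_head
        self.heads = heads
        self.to_qkv = nn.Linear(dim, inner_dim * 3, bias=False)
        self.to_out = nn.Linear(inner_dim, dim, bias=False)

    def forward(self, x):
        # x: (batch, seq, dim) -> q, k, v: (batch, heads, seq, dim_head)
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        q, k, v = (rearrange(t, 'b n (h d) -> b h n d', h=self.heads) for t in (q, k, v))

        # fused cosine-sim attention kernel, called as in the repo README
        out = flash_cosine_sim_attention(q, k, v)

        out = rearrange(out, 'b h n d -> b n (h d)')
        return self.to_out(out)
```

The kernel expects `q`, `k`, `v` shaped `(batch, heads, seq, dim_head)` and the tensors have to live on a CUDA device.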
@lucidrains Here are the results for training the ViT-16 **with** flash-cosine-sim-attention on CIFAR10 for 100 epochs.
Train and Validation loss: [screenshot]
Train and Validation accuracy: [screenshot]
@lucidrains Of course. I will now train a ViT-16 with regular attention on CIFAR10 for 100 epochs and compare the curves side by side. I will update you when everything...
@lucidrains Here are the results for the ViT-16 experiments with and without flash-cosine-sim attention. For regular attention I used a learning rate of 2e-4. For flash-cosine-sim I tested with a...
@lucidrains Here are the results for ViT-16 with and without flash-cosine-sim attention on CIFAR10 for 100 epochs with the same learning rate of 2e-4. I am using an A100 (40 GB).
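For clarity on the setup, here is a hedged sketch of the single switch point between the two runs; the `attend` helper and the `use_cosine_sim` flag are hypothetical names, and everything else in the ViT (patch embedding, MLP, the 2e-4 learning rate, 100 epochs) stays identical between the runs.

```python
import torch
from flash_cosine_sim_attention import flash_cosine_sim_attention


def attend(q, k, v, use_cosine_sim=False):
    """q, k, v: (batch, heads, seq, dim_head). Only this call differs
    between the 'with' and 'without' flash-cosine-sim CIFAR10 runs."""
    if use_cosine_sim:
        # fused flash-cosine-sim CUDA kernel (needs a GPU, e.g. the A100)
        return flash_cosine_sim_attention(q, k, v)
    # regular softmax attention baseline
    scale = q.shape[-1] ** -0.5
    attn = (q @ k.transpose(-1, -2) * scale).softmax(dim=-1)
    return attn @ v
```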
Firstly, it costs roughly $20,000 to $80,000 to train a small foundation model, but that is the plan if I can get funding. I still need to finish collecting both datasets...
> You means that "it is forbidden by the Terms of Service (ToS) of OpenAI ChatGPT". Thank you for your response to this issue.
>
> Maybe the open source...
There is currently no way to `pip install`; this will be added in the future. For now, you would have to `git clone https://github.com/conceptofmind/LaMDA-rlhf-pytorch.git`, then `cd` into the LaMDA-rlhf-pytorch directory. From there...
Hi @samadejacobs, I appreciate the insight. I will have to test both of them in conjunction and let you know. Thank you, Enrico