
๐Ÿง‘โ€๐Ÿซ 60 Implementations/tutorials of deep learning papers with side-by-side notes ๐Ÿ“; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gan...

Results: 88 annotated_deep_learning_paper_implementations issues, sorted by recently updated.

What is the main purpose or outcome of this training? Is it only for text completion? How do I fine-tune it for question answering or another downstream task? I want...
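Since the training scripts in this repo train autoregressive (text-completion) language models, one common route to a downstream task like QA is to cast it as completion: format each question as a prompt and train on (or generate) the answer as the continuation. A minimal sketch of that formatting step is below; the `format_qa` helper and the exact prompt template are illustrative assumptions, not something from the repo.

```python
def format_qa(question: str, answer: str):
    """Format a QA pair as a text-completion example (prompt + target).

    At fine-tuning time the loss would typically be masked so it only
    covers the answer tokens; this template is an assumption, not the
    repo's own convention.
    """
    prompt = f"Question: {question}\nAnswer:"
    target = f" {answer}"
    return prompt, target


prompt, target = format_qa("What is the capital of France?", "Paris")
assert prompt.endswith("Answer:")
assert target == " Paris"
```

At inference time, the model completes the same prompt template and the generated continuation is read off as the answer.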

I noticed that the Chinese translation appears to be machine-generated. Do you have any plans to officially translate this excellent project? If needed, I am willing to take on the...

The title explains itself. Here are some relevant resources: - Official Mamba repository: [https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba_simple.py](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba_simple.py) - Original paper: [https://arxiv.org/abs/2312.00752](https://arxiv.org/abs/2312.00752) - Additional resources: - [A Visual Guide to Mamba and State Space...

When training with the celeba dataset, the program encountered an error. ![Snipaste_2024-08-14_15-17-03](https://github.com/user-attachments/assets/480a70d7-e466-47dd-a54b-c57d46c1fd19)

An implementation of LoRA and other fine-tuning techniques would be nice.
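For context, the core of LoRA is small: freeze the pretrained weight `W` and learn a low-rank update `B @ A` added to it, scaled by `alpha / r`. The sketch below (plain numpy, shapes and the `alpha` scaling follow the LoRA paper; the function name is ours) shows the forward pass and the usual zero-initialization of `B`, which makes the adapter a no-op at the start of training.

```python
import numpy as np


def lora_linear(x, W, A, B, alpha=16):
    """LoRA forward pass: y = x (W + (alpha/r) * B A)^T.

    W is the frozen pretrained weight (d_out, d_in); only the low-rank
    factors A (r, d_in) and B (d_out, r) would be trained.
    """
    r = A.shape[0]
    delta = B @ A  # low-rank weight update, shape (d_out, d_in)
    return x @ (W + (alpha / r) * delta).T


rng = np.random.default_rng(0)
d_in, d_out, r = 8, 4, 2
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in))
B = np.zeros((d_out, r))  # B starts at zero, so the adapter is initially inert
x = rng.standard_normal((3, d_in))

# With B = 0 the LoRA layer matches the frozen layer exactly.
assert np.allclose(lora_linear(x, W, A, B), x @ W.T)
```

The appeal is that only `A` and `B` (2 * r * d parameters per layer) need gradients, and the update can be merged into `W` after training.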

Hi, thank you for your work. I noticed an error in the RoPE inner-product equation. Additionally, this implementation uses a different feature-pairing strategy for the rotated feature subspaces compared...
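On the pairing point: the original RoPE paper rotates adjacent (interleaved) feature pairs (x_{2i}, x_{2i+1}), while some implementations instead rotate the first and second halves of the vector; either pairing preserves RoPE's key property, that the query-key inner product depends only on the relative offset m - n. A numpy sketch of the paper's interleaved pairing, checking that property (the function name is ours):

```python
import numpy as np


def rope_interleaved(x, pos, base=10000.0):
    """Rotate adjacent pairs (x_{2i}, x_{2i+1}) by angle pos * theta_i."""
    d = x.shape[-1]
    theta = base ** (-np.arange(0, d, 2) / d)  # per-pair frequencies, (d/2,)
    ang = pos * theta
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out


rng = np.random.default_rng(0)
q, k = rng.standard_normal((2, 8))

# <R(m) q, R(n) k> depends only on m - n: offsets (7, 3) and (5, 1) agree.
a = rope_interleaved(q, 7) @ rope_interleaved(k, 3)
b = rope_interleaved(q, 5) @ rope_interleaved(k, 1)
assert np.allclose(a, b)
```

The half-split ("rotate_half") variant amounts to a fixed permutation of the feature dimension, so it gives the same relative-position behavior with a different pairing of coordinates.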

I wonder why the array shapes in aha are (C, B, D) rather than (B, C, D). I thought it was convention that the batch was the first dimension. Specifically, here...
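For background: sequence-first layouts like (seq, batch, d) are the older PyTorch RNN/Transformer default, while batch-first (batch, seq, d) is now more common; converting between them is just a transpose of the first two axes. A small numpy sketch of the conversion:

```python
import numpy as np

# A tensor in sequence-first layout: (seq=2, batch=3, d=4).
x_sbd = np.arange(2 * 3 * 4).reshape(2, 3, 4)

# Swap the first two axes to get batch-first: (batch=3, seq=2, d=4).
x_bsd = x_sbd.swapaxes(0, 1)

assert x_bsd.shape == (3, 2, 4)
# Element at (seq=0, batch=1) is the same vector as (batch=1, seq=0).
assert (x_bsd[1, 0] == x_sbd[0, 1]).all()
```

Which layout is faster can depend on the kernel's memory-access pattern, which is often why implementations keep sequence-first internally even when the public API is batch-first.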

Hello, I've been learning various AI/ML-related algorithms recently, and my notes are quite similar to the content of your repository. Also, this excellent work has helped me understand some of...

PLEASE CORRECT ME IF I'M WRONG. I believe the line `attn = attn.softmax(dim=2)` is incorrect. https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/05321d644e4fed67d8b2856adc2f8585e79dfbee/labml_nn/diffusion/ddpm/unet.py#L188 Dim 1 contains the index (i) over the query sequence entries, and dim...
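Whatever the correct dim is in that file, the invariant the issue is after is this: attention weights must be normalized over the key index (j), so each query's weights sum to 1. A numpy sketch with scores laid out as (batch, i, j, heads), where that means softmax over axis 2 (the layout here is an assumption for illustration, not a claim about the repo's code):

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(0)
B, H, Lq, Lk, d = 1, 2, 3, 5, 4
q = rng.standard_normal((B, Lq, H, d))
k = rng.standard_normal((B, Lk, H, d))

# scores[b, i, j, h]: axis 1 indexes queries (i), axis 2 indexes keys (j).
scores = np.einsum('bihd,bjhd->bijh', q, k) / np.sqrt(d)

# Normalize over the key axis: each query's weights form a distribution.
attn = softmax(scores, axis=2)
assert np.allclose(attn.sum(axis=2), 1.0)
```

Normalizing over the wrong axis (the query index) would still produce values that sum to 1 somewhere, which is why this kind of bug is easy to miss; checking which axis carries j in the einsum output is the reliable test.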

![Image](https://github.com/user-attachments/assets/67d19be6-67a8-4b3c-9bdb-422c7c50f87d)