annotated_deep_learning_paper_implementations issues

How to use my own database for training and evaluating Retro for Question-Answering?

2

What is the main purpose or outcome of this training? Is it only for text completion? How do I finetune it for Question-Answering or any other downstream task? I want...

Zahin112

Chinese Translation

4

I noticed that the Chinese translation appears to be machine-generated. Do you have any plans to officially translate this excellent project? If needed, I am willing to take on the...

pengchzn

Request for Implementation of Mamba Paper

The title explains itself. Here are some relevant resources:
- Official Mamba repository: [https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba_simple.py](https://github.com/state-spaces/mamba/blob/main/mamba_ssm/modules/mamba_simple.py)
- Original paper: [https://arxiv.org/abs/2312.00752](https://arxiv.org/abs/2312.00752)
- Additional resources:
  - [A Visual Guide to Mamba and State Space...

huyiwen

Unet error in DDPM

4

When training with the CelebA dataset, the program encountered an error. ![Snipaste_2024-08-14_15-17-03](https://github.com/user-attachments/assets/480a70d7-e466-47dd-a54b-c57d46c1fd19)

TwinkleStarst

LORA

2

An implementation of LORA and other tuning techniques would be nice.

erlebach

Fix RoPE inner product equation & add note on the difference in implementation

Hi, thank you for your work. I noticed an error in the RoPE inner product equation. Additionally, this implementation uses a different feature pairing strategy for rotating feature subspaces compared...

thanhtcptit

mha.py array shapes

1

I wonder why the array shapes in mha are (C, B, D) rather than (B, C, D). I thought the convention was for the batch to be the first dimension. Specifically, here...

erlebach

How to Contribute to This Repository

8

Hello, I’ve been learning various AI/ML-related algorithms recently, and my notes are quite similar to the content of your repository. Also this excellent work has helped me understand some of...

terancejiang

Attention takes softmax over wrong dimension

1

PLEASE CORRECT ME IF I'M WRONG. I believe the line `attn = attn.softmax(dim=2)` is incorrect. https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/05321d644e4fed67d8b2856adc2f8585e79dfbee/labml_nn/diffusion/ddpm/unet.py#L188 Dim 1 contains the index (i) over the query sequence entries, and dim...

Trezorro

Unable to install the labml_helpers package

![Image](https://github.com/user-attachments/assets/67d19be6-67a8-4b3c-9bdb-422c7c50f87d)

wsl1717
