annotated_deep_learning_paper_implementations
🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gan...
Hello! Thank you for your fantastic work! I'm currently working on training the latent diffusion model using a custom dataset. I noticed that the repository doesn't include the training code...
Hello, I am a user from China. Due to network reasons, I have been receiving the following warnings during the training process: LABML Warning: timed out failed to connect: https://api.labml.ai/api/v1/track?run_uuid=ec1a4ac018ea11ee9913d8bbc1db2896&labml_version=0.4.162...
I copied the original code, but it raises an error. The output shows a tensor operation exception in this statement. ```py x_rope = (x_rope *...
fix the following error: ``` Traceback (most recent call last): File "e:\data\frid\python\codes\adlpi\labml_nn\transformers\rope\__init__.py", line 231, in _test_rotary() File "e:\data\frid\python\codes\adlpi\labml_nn\transformers\rope\__init__.py", line 227, in _test_rotary inspect(rotary_pe(x)) File "E:\data\frid\python\p121\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl File...
Hi, thank you for your great work, I really appreciate it! I'm wondering if I can make annotations just like the website shows, and perhaps running in my local...
${(\textcolor{lightgreen}{\mathbf{A + C}})}_{i,j} = Q_i^\top K_j + \textcolor{orange}{v^\top} K_j$
The [torchvision.transforms.ToTensor](https://pytorch.org/vision/master/generated/torchvision.transforms.ToTensor.html) transform scales images from the range **[0, 255]** to the range **[0.0, 1.0]**, but according to the original paper they should be scaled to the range **[-1.0, 1.0]**.
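The rescaling described above is plain arithmetic, so a minimal dependency-free sketch may help. The helper names `to_unit_range` and `to_symmetric_range` are hypothetical and not part of the repository. In a torchvision pipeline the same mapping is commonly obtained by appending `transforms.Normalize((0.5,), (0.5,))` after `ToTensor`; whether that is the right fix for this particular experiment is an assumption.

```python
def to_unit_range(pixel: int) -> float:
    """Mimic the scaling ToTensor applies to 8-bit pixels: [0, 255] -> [0.0, 1.0]."""
    return pixel / 255.0


def to_symmetric_range(unit_value: float) -> float:
    """Map a value from [0.0, 1.0] to [-1.0, 1.0] (hypothetical helper)."""
    return 2.0 * unit_value - 1.0


# Example 8-bit pixel values: darkest, mid-grey, brightest
pixels = [0, 128, 255]
scaled = [to_symmetric_range(to_unit_range(p)) for p in pixels]
```

Here the darkest pixel maps to -1.0 and the brightest to 1.0, with intermediate values spread linearly between them.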
This experiment does not follow the format used previously for group normalization, and it is longer than it needs to be. With these modifications, the code follows that format and is shorter.
Fix typo
https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/a0679ecd90b41b8e012995a6bdf095edae590b17/labml_nn/diffusion/ddpm/evaluate.py#L138 I know how to interpolate by interpolating in the diffused space and then sending it back to the original space. That's why I think the notation is wrong. Please...
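For readers following this report, the procedure it describes (diffuse both samples, interpolate in the diffused space, then map back) can be sketched roughly as below. This is an illustration under stated assumptions, not the code in `evaluate.py`: it assumes a linear beta schedule from 1e-4 to 0.02 over 1000 steps, uses toy 3-element lists in place of images, and the names `alpha_bar`, `q_sample`, and `interpolate` are hypothetical. The reverse (denoising) pass needs a trained model and is left out.

```python
import math
import random


def alpha_bar(t, n_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Product of (1 - beta_s) over the first t steps of an assumed linear schedule."""
    product = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (n_steps - 1)
        product *= 1.0 - beta
    return product


def q_sample(x0, t, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I) elementwise."""
    a_bar = alpha_bar(t)
    return [
        math.sqrt(a_bar) * v + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for v in x0
    ]


def interpolate(a, b, lam):
    """Linear interpolation (1 - lam) * a + lam * b in the diffused space."""
    return [(1.0 - lam) * u + lam * v for u, v in zip(a, b)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
x1_t = q_sample([0.0, 0.5, 1.0], t=100, rng=rng)   # toy "image" 1, diffused
x2_t = q_sample([1.0, -0.5, 0.0], t=100, rng=rng)  # toy "image" 2, diffused
xt_mix = interpolate(x1_t, x2_t, lam=0.5)
# xt_mix would next be denoised from step t back to step 0 by a trained model
# (omitted here) to return to the original space.
```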