LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
If I understand correctly, single quotes are missing here due to an oversight and are not displayed.
When reading the README.md for this repository, it's not immediately clear what this repository contains or what it is for. I think this should be clarified.
This PR adds basic devcontainer configuration files for Docker-based development. **Update**: PyTorch version 2.2.0. **Todo**: Add/update the docs to refer to Docker installation and usage.
Add missing import.
Hi @rasbt, This [notebook](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch03/01_main-chapter-code/ch03.ipynb) contains the following implementation of CausalAttention:

```python
class CausalAttention(nn.Module):
    def __init__(self, d_in, d_out, block_size, dropout, qkv_bias=False):
        super().__init__()
        self.d_out = d_out
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key...
```
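Since the excerpt above is cut off, here is a minimal self-contained sketch of what such a causal attention module typically looks like; the mask buffer, scaling, and forward pass below are assumptions based on the constructor signature, not necessarily the notebook's exact code:

```python
import torch
import torch.nn as nn

class CausalAttention(nn.Module):
    def __init__(self, d_in, d_out, block_size, dropout, qkv_bias=False):
        super().__init__()
        self.d_out = d_out
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_value = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.dropout = nn.Dropout(dropout)
        # Upper-triangular mask: position i may only attend to positions <= i
        self.register_buffer(
            "mask", torch.triu(torch.ones(block_size, block_size), diagonal=1)
        )

    def forward(self, x):
        b, num_tokens, d_in = x.shape
        queries = self.W_query(x)
        keys = self.W_key(x)
        values = self.W_value(x)

        attn_scores = queries @ keys.transpose(1, 2)
        attn_scores.masked_fill_(self.mask.bool()[:num_tokens, :num_tokens], -torch.inf)
        attn_weights = torch.softmax(attn_scores / keys.shape[-1] ** 0.5, dim=-1)
        attn_weights = self.dropout(attn_weights)

        return attn_weights @ values  # context vectors, shape (b, num_tokens, d_out)
```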
Hi @rasbt, It seems that cell [36] in the [notebook](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch03/01_main-chapter-code/ch03.ipynb) with the main code contains the solution to Exercise 3.2. Thank you.
Hi @rasbt, I found the following statement in the mentioned section: > Figure 3.24 illustrates the structure of a multi-head attention module, which consists of multiple single-head attention modules, as...
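The structure described in that statement, multiple single-head attention modules combined into one multi-head module, can be sketched roughly as follows; the wrapper class name and the concatenation of head outputs along the last dimension are assumptions for illustration (reusing the CausalAttention sketch shown above), not necessarily the book's exact code:

```python
import torch
import torch.nn as nn

class MultiHeadAttentionWrapper(nn.Module):
    """Combine several single-head causal attention modules into one multi-head module."""

    def __init__(self, d_in, d_out, block_size, dropout, num_heads, qkv_bias=False):
        super().__init__()
        self.heads = nn.ModuleList(
            [CausalAttention(d_in, d_out, block_size, dropout, qkv_bias)
             for _ in range(num_heads)]
        )

    def forward(self, x):
        # Each head returns (b, num_tokens, d_out); concatenating gives d_out * num_heads features
        return torch.cat([head(x) for head in self.heads], dim=-1)
```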
Hi @rasbt, I am trying to explore and reproduce Chapter 3 and found that I can't reproduce results that you specified in the notebook and the book, even if I...
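The issue text is truncated here, so it is not clear which values are meant; one common reason attention outputs differ between runs is that the random initialization of the nn.Linear layers is not seeded. A minimal sketch, assuming that is the relevant factor (the seed 123 is only illustrative):

```python
import torch
import torch.nn as nn

torch.manual_seed(123)               # illustrative seed; set it before constructing any layers
layer = nn.Linear(3, 2, bias=False)  # initial weights now depend only on the seed
print(layer.weight)                  # identical across runs with the same seed and PyTorch version
```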
Hi @rasbt, I noticed that in the book you provide the following code with function name `create_dataloader` and the argument `stride = max_length + 1` to avoid overlap in data...
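For readers comparing the two settings, the effect of stride on window overlap can be illustrated with a small standalone helper; sliding_windows below is a simplified illustration, not the book's create_dataloader implementation:

```python
def sliding_windows(token_ids, max_length, stride):
    """Yield (input, target) chunks; the target is the input shifted right by one token."""
    for i in range(0, len(token_ids) - max_length, stride):
        yield token_ids[i:i + max_length], token_ids[i + 1:i + max_length + 1]

tokens = list(range(10))
# stride == max_length: consecutive input windows are adjacent and do not overlap
print(list(sliding_windows(tokens, max_length=4, stride=4)))
# stride == max_length + 1: one token is skipped between consecutive input windows
print(list(sliding_windows(tokens, max_length=4, stride=5)))
```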
Added CUDA support information to the Docker README file.