LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Results 31 LLMs-from-scratch issues

If I understand correctly, single quotes are missing here due to an oversight, so they are not displayed.

When reading the README.md for this repository, it's not immediately clear what this repository contains or what it is for. I think this should be clarified.

This PR adds basic devcontainer configuration files for Docker-based development. **Update**: PyTorch version 2.2.0 **Todo**: Add/update docs to refer to Docker installation and usage.

Hi @rasbt, This [notebook](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch03/01_main-chapter-code/ch03.ipynb) contains the following implementation of CausalAttention:

```python
class CausalAttention(nn.Module):
    def __init__(self, d_in, d_out, block_size, dropout, qkv_bias=False):
        super().__init__()
        self.d_out = d_out
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key...
```

Hi @rasbt, It seems that cell [36] in the [notebook](https://github.com/rasbt/LLMs-from-scratch/blob/main/ch03/01_main-chapter-code/ch03.ipynb) with the main code contains the solution to Exercise 3.2. Thank you.

Hi @rasbt, I found the following statement in the mentioned section: > Figure 3.24 illustrates the structure of a multi-head attention module, which consists of multiple single-head attention modules, as...

Hi @rasbt, I am trying to explore and reproduce Chapter 3 and found that I can't reproduce results that you specified in the notebook and the book, even if I...

Hi @rasbt, I noticed that in the book you provide the following code with function name `create_dataloader` and the argument `stride = max_length + 1` to avoid overlap in data...
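For readers skimming this issue, here is a minimal sketch of how the stride affects overlap between samples. It assumes each sample is an input window of `max_length` tokens paired with the same window shifted by one token as the target. The helper name `sliding_windows` and the toy token list are hypothetical and are not taken from the book's `create_dataloader`.

```python
def sliding_windows(token_ids, max_length, stride):
    """Return (input, target) pairs; each target is its input shifted by one token.

    Hypothetical helper for illustration only, not the book's implementation.
    """
    pairs = []
    for i in range(0, len(token_ids) - max_length, stride):
        inputs = token_ids[i:i + max_length]
        targets = token_ids[i + 1:i + max_length + 1]
        pairs.append((inputs, targets))
    return pairs


tokens = list(range(10))  # toy token IDs 0..9

# stride == max_length: inputs do not overlap each other, but the last
# target token of one sample is the first input token of the next sample.
pairs_a = sliding_windows(tokens, max_length=4, stride=4)

# stride == max_length + 1: no token is shared between consecutive samples.
pairs_b = sliding_windows(tokens, max_length=4, stride=5)

print(pairs_a)
print(pairs_b)
```

Under these assumptions, with `stride=4` token `4` appears both as the last target of the first sample and as the first input of the second sample, while with `stride=5` the two samples share no tokens.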

Added CUDA support information to the Docker README file.