LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
I just checked out the code in appendix-A/01_main-chapter-code/DDP-script.py. How about adding

```python
from torch.profiler import profile

with profile() as prof:
    # the main training function code
    ...

if rank == 0:
    print("exporting...
```
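For context, a minimal sketch of what such a profiling wrapper could look like around the DDP training loop, assuming the standard `torch.profiler` API. The `train(rank, world_size)` call and the trace filename are hypothetical placeholders, not names from the actual script:

```python
import torch
from torch.profiler import profile, ProfilerActivity

def run_with_profiler(rank, world_size):
    # Profile CPU ops, and CUDA kernels if a GPU is present
    activities = [ProfilerActivity.CPU]
    if torch.cuda.is_available():
        activities.append(ProfilerActivity.CUDA)

    with profile(activities=activities) as prof:
        train(rank, world_size)  # hypothetical: the script's main training function

    if rank == 0:
        # Summarize the most expensive ops and export a trace
        # viewable in chrome://tracing or https://ui.perfetto.dev
        print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
        prof.export_chrome_trace("ddp_train_trace.json")  # hypothetical filename
```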
There appears to be an issue when running the code from chapter 6 (other sections not tested):

## Error

```
Traceback (most recent call last):
  File "/home/user/workspace/project/llm/tune_incl.py", line 359, in...
```
* ch05_02, Row 4: `python additional-experiments.py --trainable_layers two_last_blocks` --> `last_two_blocks`
* ch05/06: fixed minor typos
This pull request adds support for running inference on Habana Gaudi (HPU) processors by introducing a new directory dedicated to the Gaudi-specific implementation. It includes setup instructions, scripts for downloading GPT-2...
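As a rough illustration of what HPU inference looks like in PyTorch (a sketch, not necessarily what this PR implements), assuming the Intel Gaudi software stack (`habana_frameworks`) is installed; the model and input are hypothetical stand-ins:

```python
import torch
import habana_frameworks.torch.core as htcore  # requires the Gaudi PyTorch bridge

device = torch.device("hpu")

# Hypothetical: any torch.nn.Module would do, e.g. a GPT-2 model from the book
model = torch.nn.Linear(768, 768).to(device)
model.eval()

x = torch.randn(1, 768).to(device)

with torch.no_grad():
    y = model(x)
    htcore.mark_step()  # flush the accumulated lazy-mode graph to the Gaudi device

print(y.shape)
```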
## Proposal

I’d like to add a new section at the end of Chapter 06, “Deploy on Streamlit Community Cloud,” which walks readers through:

1. Uploading their trained model to...
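To make the proposal concrete, here is a minimal sketch of such a Streamlit app, assuming only the standard `streamlit` and `torch` APIs. The checkpoint path and the `generate` helper are hypothetical placeholders for whatever Chapter 06 produces:

```python
import streamlit as st
import torch

@st.cache_resource  # load the model once per server process, not per interaction
def load_model():
    # Hypothetical: replace with the book's model class and your own checkpoint
    model = torch.load("model.pth", map_location="cpu", weights_only=False)
    model.eval()
    return model

st.title("LLMs-from-scratch demo")

prompt = st.text_input("Prompt", value="Every effort moves you")

if st.button("Generate"):
    model = load_model()
    with torch.no_grad():
        output = generate(model, prompt)  # hypothetical: the chapter's generation helper
    st.write(output)
```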
Fixes for several issues in the package

* fixes #675; the code for the `encode` function was taken from `ch05/07_gpt_to_llama/converting-llama2-to-llama3.ipynb` (please double-check that everything is okay)
* fixes `tqdm` import...
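For readers following along, here is a hedged sketch of the BOS/EOS-wrapping pattern that a tiktoken-based `encode` typically follows. The `cl100k_base` encoding, the class name, and the special-token IDs below are illustrative stand-ins, not the actual Llama 3 vocabulary:

```python
import tiktoken

class TokenizerSketch:
    """Illustrative only: wraps a tiktoken encoding with optional BOS/EOS tokens."""

    def __init__(self):
        # Stand-in encoding; the real Llama 3 tokenizer loads its own BPE ranks
        # and special-token table via tiktoken.Encoding(...)
        self.model = tiktoken.get_encoding("cl100k_base")
        self.special_tokens = {
            "<|begin_of_text|>": 128000,  # hypothetical IDs for illustration
            "<|end_of_text|>": 128001,
        }

    def encode(self, text, bos=False, eos=False):
        ids = []
        if bos:
            ids.append(self.special_tokens["<|begin_of_text|>"])
        ids.extend(self.model.encode(text))
        if eos:
            ids.append(self.special_tokens["<|end_of_text|>"])
        return ids

tok = TokenizerSketch()
print(tok.encode("Hello, world!", bos=True, eos=True))
```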
Could you provide more details about how you determined the context length? I found this information:

* The 0.6B model seems to support only 32k (`32,768`) tokens:
  https://qwenlm.github.io/blog/qwen3/#introduction
  https://huggingface.co/Qwen/Qwen3-0.6B/blob/main/README.md
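One way to cross-check this (a sketch, assuming the Hugging Face `transformers` library and that the repo's `config.json` exposes the usual field) is to read the model config directly:

```python
from transformers import AutoConfig

# Downloads only the small config.json, not the model weights
config = AutoConfig.from_pretrained("Qwen/Qwen3-0.6B")

# Most HF causal LMs report the maximum context size here; note that
# blog posts sometimes quote a different (native vs. extended) length
print(config.max_position_embeddings)
```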
### Bug description

I noticed something small while looking at the Llama 3 tokenizer code and thought it might be helpful to mention:

https://github.com/rasbt/LLMs-from-scratch/blob/ece59ba58768db7b34d9b5d5f88677de8c1e84ea/pkg/llms_from_scratch/llama3.py#L315-L316

and

https://github.com/rasbt/LLMs-from-scratch/blob/ece59ba58768db7b34d9b5d5f88677de8c1e84ea/pkg/llms_from_scratch/llama3.py#L325-L326

In VS Code, the...