
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Results 31 LLMs-from-scratch issues

I just checked out the code in appendix-A/01_main-chapter-code/DDP-script.py. How about adding ``` from torch.profiler import profile with profile() as prof: # the main function training code if rank == 0: print("exporting...
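The profiling idea in this issue can be sketched roughly as follows. This is a minimal single-process illustration, not the code from `DDP-script.py`: the tiny linear model, the `train_step` and `run_profiled_training` helpers, the hard-coded `rank=0`, and the trace file location are all assumptions made for the example. Only `torch.profiler.profile`, `ProfilerActivity`, and `export_chrome_trace` are taken from PyTorch itself.

```python
import os
import tempfile

import torch
from torch.profiler import ProfilerActivity, profile


def train_step(model, optimizer, x, y):
    # One plain optimization step on a toy regression problem (hypothetical stand-in
    # for the real training code).
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()


def run_profiled_training(rank=0, num_steps=3):
    # rank is hard-coded here; in a DDP script it would come from the launcher.
    torch.manual_seed(0)
    model = torch.nn.Linear(8, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(16, 8), torch.randn(16, 1)

    # Wrap the training loop in the profiler (CPU activity only in this sketch).
    with profile(activities=[ProfilerActivity.CPU]) as prof:
        for _ in range(num_steps):
            train_step(model, optimizer, x, y)

    # Export the trace from a single rank so processes do not overwrite each other.
    trace_path = None
    if rank == 0:
        trace_path = os.path.join(tempfile.mkdtemp(), "trace.json")
        prof.export_chrome_trace(trace_path)
    return trace_path


trace_path = run_profiled_training()
print("trace written to", trace_path)
```

The exported JSON file can be opened in a Chrome-trace-compatible viewer. Guarding the export with `rank == 0` mirrors the condition quoted in the issue text.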

There appears to be an issue when running the code from chapter 6 (other sections not tested): ## Error ``` Traceback (most recent call last): File "/home/user/workspace/project/llm/tune_incl.py", line 359, in...

* ch05_02, Row 4: `python additional-experiments.py --trainable_layers two_last_blocks` --> last_two_blocks
* ch05/06: fixed minor typos
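A typo such as `two_last_blocks` can be caught at the command line if the option is restricted to a fixed set of values. The sketch below assumes the script parses its arguments with `argparse`; the list of valid values, the default, and the helper names `build_parser` and `parse_trainable_layers` are hypothetical and may not match the actual `additional-experiments.py`.

```python
import argparse

# Hypothetical set of accepted values; the real script may define a different list.
TRAINABLE_LAYER_CHOICES = ["all", "last_block", "last_two_blocks", "last_layer"]


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative argument parser")
    parser.add_argument(
        "--trainable_layers",
        choices=TRAINABLE_LAYER_CHOICES,  # argparse rejects anything outside this list
        default="last_block",
        help="Which layers to train.",
    )
    return parser


def parse_trainable_layers(argv):
    # Return the parsed value, or None if argparse rejects the arguments.
    try:
        return build_parser().parse_args(argv).trainable_layers
    except SystemExit:
        # argparse prints a usage error to stderr and exits on invalid choices.
        return None


valid = parse_trainable_layers(["--trainable_layers", "last_two_blocks"])
invalid = parse_trainable_layers(["--trainable_layers", "two_last_blocks"])
print(valid, invalid)
```

With `choices` in place, the misspelled value from the table row would fail immediately with a usage error instead of being passed on to the rest of the script.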

This pull request adds support for running inference on Habana Gaudi (HPU) processors by introducing a new directory dedicated to Gaudi-specific implementation. It includes setup instructions, scripts for downloading GPT-2...

## Proposal I’d like to add a new section at the end of Chapter 06, “Deploy on Streamlit Community Cloud,” which walks readers through: 1. Uploading their trained model to...

documentation

Fixes for several issues in the package
* fixes #675; code for the `encode` function has been taken from `ch05\07_gpt_to_llama\converting-llama2-to-llama3.ipynb` (please double-check if everything is okay)
* fixes `tqdm` import...

Could you provide more details about how you determined the context length? I found this information: * The 0.6b model seems to support only 32k (`32,768`) tokens https://qwenlm.github.io/blog/qwen3/#introduction https://huggingface.co/Qwen/Qwen3-0.6B/blob/main/README.md

question

### Bug description I noticed something small while looking at the Llama 3 tokenizer code and thought it might be helpful to mention: https://github.com/rasbt/LLMs-from-scratch/blob/ece59ba58768db7b34d9b5d5f88677de8c1e84ea/pkg/llms_from_scratch/llama3.py#L315-L316 and https://github.com/rasbt/LLMs-from-scratch/blob/ece59ba58768db7b34d9b5d5f88677de8c1e84ea/pkg/llms_from_scratch/llama3.py#L325-L326 In VS Code, the...

bug