Sebastian Raschka
Some models have the GGUF weights on the Hub: https://huggingface.co/QuantFactory/Meta-Llama-3-8B-GGUF/tree/main. We would need to find and map those to the respective models one by one, I think. Maybe via an...
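A minimal sketch of what such a one-by-one mapping could look like, assuming a hand-maintained dict from model names to GGUF repos; the dict entry and the example filename are illustrative guesses about the repo contents, not a vetted list (`hf_hub_download` is the standard Hub download helper):

```python
from huggingface_hub import hf_hub_download

# Hypothetical mapping from model names to GGUF repos on the Hub.
# The entry below is illustrative, not a vetted list.
GGUF_REPOS = {
    "meta-llama/Meta-Llama-3-8B": "QuantFactory/Meta-Llama-3-8B-GGUF",
}

def download_gguf(model_name: str, filename: str) -> str:
    """Fetch a GGUF file for a mapped model and return the local path."""
    repo_id = GGUF_REPOS[model_name]
    return hf_hub_download(repo_id=repo_id, filename=filename)

# Example usage (the filename is an assumption about the repo's contents):
# path = download_gguf("meta-llama/Meta-Llama-3-8B", "Meta-Llama-3-8B.Q4_K_M.gguf")
```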
Hi there, I recently stumbled upon your paper, and Phudge looks great! I was wondering whether you have considered adding it to Ollama so that it can be used in an...
After the next Lightning release, we can raise the supported bitsandbytes version, since Lightning now supports it (see https://github.com/Lightning-AI/pytorch-lightning/pull/20313).
Investigating the RoPE implementation.

Fixes #1713
Fixes #1699
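For reference while investigating, a minimal textbook RoPE formulation to cross-check against; this is the standard interleaved rotation, not litgpt's exact code:

```python
import torch

def build_rope_cache(seq_len: int, head_dim: int, base: float = 10000.0):
    # One inverse frequency per pair of dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    t = torch.arange(seq_len).float()
    freqs = torch.outer(t, inv_freq)  # (seq_len, head_dim // 2)
    return torch.cos(freqs), torch.sin(freqs)

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor):
    # x: (..., seq_len, head_dim); rotate each even/odd pair by its angle.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```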
If we ever have the time, it might be nice to add the [microsoft/Phi-3.5-MoE-instruct](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) checkpoint, since it would be a good use of the MoE capabilities we added for Mixtral.
Right now we only support the Phi 3 variant that handles up to 4k tokens; it would be nice to also support the 128k-token version: [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct).
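A rough sketch of where the two variants likely differ, assuming (as with other long-context checkpoints) that the change is mostly a larger context length plus a RoPE scaling scheme; the class and field names below are illustrative, not litgpt's actual config:

```python
from dataclasses import dataclass

# Illustrative config sketch only; field names are assumptions, not litgpt's.
@dataclass
class PhiConfig:
    block_size: int = 4096            # max context length
    rope_base: float = 10000.0        # RoPE theta
    rope_scaling: dict | None = None  # e.g. a long-context scaling scheme

phi3_mini_4k = PhiConfig()
phi3_mini_128k = PhiConfig(block_size=131072, rope_scaling={"type": "longrope"})
```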
### Bug description

When running the pretraining example:

```bash
mkdir -p custom_texts
curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
curl https://www.gutenberg.org/cache/epub/26393/pg26393.txt --output custom_texts/book2.txt

# 1) Download a tokenizer
litgpt download EleutherAI/pythia-160m \
  --tokenizer_only...
```
There should probably be an option to disable the KV cache in the Python API, as part of the compute/memory trade-off story. (It could also be useful for debugging.)
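A sketch of what the proposed option might look like; `LLM.load` and `generate` exist in litgpt's Python API, but the `use_kv_cache` flag below is hypothetical and only illustrates the proposal:

```python
from litgpt import LLM

llm = LLM.load("microsoft/phi-2")

# Today: the KV cache is managed internally during generation.
text = llm.generate("What is RoPE?", max_new_tokens=64)

# Proposed (hypothetical flag, does not exist yet): trade compute for memory
# by recomputing attention over the full prefix instead of caching it.
# text = llm.generate("What is RoPE?", max_new_tokens=64, use_kv_cache=False)
```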
This PR bumps the version, since there have been a number of changes and fixes since the last release. This makes it a bit easier to detect which version is currently installed...
Fixes the link to the GPT-2 paper.