Sebastian Raschka
Some models have the GGUF weights on the Hub: https://huggingface.co/QuantFactory/Meta-Llama-3-8B-GGUF/tree/main. We would need to find and map those to the respective models one by one, I think. Maybe via an...
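A minimal sketch of what such a one-by-one mapping could look like, assuming a hand-maintained dict from model names to GGUF repos; the dict entry and the example filename are illustrative guesses about the repo contents, not a vetted list (`hf_hub_download` is the standard Hub download helper):

```python
from huggingface_hub import hf_hub_download

# Hypothetical mapping from model names to GGUF repos on the Hub.
# The entry below is illustrative, not a vetted list.
GGUF_REPOS = {
    "meta-llama/Meta-Llama-3-8B": "QuantFactory/Meta-Llama-3-8B-GGUF",
}

def download_gguf(model_name: str, filename: str) -> str:
    """Fetch a GGUF file for a mapped model and return the local path."""
    repo_id = GGUF_REPOS[model_name]
    return hf_hub_download(repo_id=repo_id, filename=filename)

# Example usage (the filename is an assumption about the repo's contents):
# path = download_gguf("meta-llama/Meta-Llama-3-8B", "Meta-Llama-3-8B.Q4_K_M.gguf")
```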
Hi there, I recently stumbled upon your paper, and Phudge looks great! I was wondering whether you have considered adding it to Ollama so that it can be used in an...
After the next Lightning release, we can raise the supported bitsandbytes version, since Lightning now supports it (see https://github.com/Lightning-AI/pytorch-lightning/pull/20313).
Investigating the RoPE implementation.

Fixes #1713
Fixes #1699
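For reference while investigating, a minimal textbook RoPE formulation to cross-check against; this is the standard interleaved rotation, not litgpt's exact code:

```python
import torch

def build_rope_cache(seq_len: int, head_dim: int, base: float = 10000.0):
    # One inverse frequency per pair of dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    t = torch.arange(seq_len).float()
    freqs = torch.outer(t, inv_freq)  # (seq_len, head_dim // 2)
    return torch.cos(freqs), torch.sin(freqs)

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor):
    # x: (..., seq_len, head_dim); rotate each even/odd pair by its angle.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```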
If we ever have the time, it might be nice to add the [microsoft/Phi-3.5-MoE-instruct](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct) checkpoint, since it would be a good use of the MoE capabilities we added for Mixtral.
Right now we only support the Phi 3 variant that handles up to 4k tokens; it would be nice to also support the 128k-token version: [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct).
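A rough sketch of where the two variants likely differ, assuming (as with other long-context checkpoints) that the change is mostly a larger context length plus a RoPE scaling scheme; the class and field names below are illustrative, not litgpt's actual config:

```python
from dataclasses import dataclass

# Illustrative config sketch only; field names are assumptions, not litgpt's.
@dataclass
class PhiConfig:
    block_size: int = 4096            # max context length
    rope_base: float = 10000.0        # RoPE theta
    rope_scaling: dict | None = None  # e.g. a long-context scaling scheme

phi3_mini_4k = PhiConfig()
phi3_mini_128k = PhiConfig(block_size=131072, rope_scaling={"type": "longrope"})
```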
### Bug description

When running the pretraining example:

```bash
mkdir -p custom_texts
curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
curl https://www.gutenberg.org/cache/epub/26393/pg26393.txt --output custom_texts/book2.txt

# 1) Download a tokenizer
litgpt download EleutherAI/pythia-160m \
  --tokenizer_only...
```
There should probably be an option to disable the KV cache in the Python API, as part of the compute/memory trade-off story. (It could also be useful for debugging.)
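A sketch of what the proposed option might look like; `LLM.load` and `generate` exist in litgpt's Python API, but the `use_kv_cache` flag below is hypothetical and only illustrates the proposal:

```python
from litgpt import LLM

llm = LLM.load("microsoft/phi-2")

# Today: the KV cache is managed internally during generation.
text = llm.generate("What is RoPE?", max_new_tokens=64)

# Proposed (hypothetical flag, does not exist yet): trade compute for memory
# by recomputing attention over the full prefix instead of caching it.
# text = llm.generate("What is RoPE?", max_new_tokens=64, use_kv_cache=False)
```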
This PR bumps the version, since there have been a number of changes and fixes since the last release. This makes it a bit easier to detect which version is currently installed...
Fixes the link to the GPT-2 paper.