Sebastian Raschka comments

Results: 820 comments of Sebastian Raschka

Add phi-3 checkpoint

> There is a modeling_*.py file. > Good luck 🙂. Haha, I finally got the weights loaded but of course it's never easy ... of course it's generating gibberish ...

Some more tidbits via [Daniel Han](https://twitter.com/danielhanchen/status/1782853167572832650): > Phi 3 (3.8B) got released! The paper said it was just a Llama arch, but I found some quirks while adding this to...

Add phi-3 checkpoint

Looks like the sliding window number was a typo: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/commit/b043e05a86cfc77f8d53eb0edf6a33e39afbcb5e

Add phi-3 checkpoint

> The missing piece is the Tokenizer: it has a smaller vocab size (32k vs 50k) that was extended by 64 special tokens. If I'm not mistaken, the current code...

Add phi-3 checkpoint

A related, interesting post, @Andrei-Aksionov: https://x.com/danielhanchen/status/1795453604532207989

Add phi-3 checkpoint

Thanks so much! I am currently moving and offline until the weekend/Monday. Will take a look when I am back!

Add phi-3 checkpoint

I think the tests are failing because of the new Eval Harness release (https://pypi.org/project/lm-eval/#history). I can look into it in a separate PR.

Add phi-3 checkpoint

All good now. Big thanks again @Andrei-Aksionov !!

Quizzes report FP but do not report FN when multiple answers are valid

Converted this to an issue to address this in the future. Will need some focus time with our web devs to tackle that.

Update 4_compile.py

Awesome, thanks for jumping in here. Would love to get some insights wrt how to improve that. I should mention that I used CUDA 11.8. Let me try the sample...