jjhoow comments

Results 6 comments of


                                            jjhoow

Token doesn't work

Change it `PROTECT_ROUTES=true`

How can i do continued pre-training using this?

Taking advantage of the question, I would like to know if I can use galore_adamw8bit_per_layer to train specific layers while freezing others? If so, could I use llama-pro (https://github.com/TencentARC/LLaMA-Pro) to...

How can i do continued pre-training using this?

@jiaweizzhao I performed the test earlier with the code above and using galore_adamw8bit_per_layer, i need to get the parameters right because I noticed a roller coaster movement.

Training a Vision Model with Text-only Inputs

> Any updates? Does text-only already work? This approach above works in Qwen based on my tests.

Training a Vision Model with Text-only Inputs

If the text is already in the expected structure for the model with the system, user, and assistant tags, I believe it will work. I trained it using ShareGPT and...

Training a Vision Model with Text-only Inputs

Unsloth is great also because it's hackable! So I tested making the changes above however I wanted. As for the multiple RLHF methods on Hugging Face, there is a brief...