raghavgarg97
Hey @hills-code, could you also add the code for converting a model by adding identity blocks for training? I am excited to use similar techniques for other open-source models like...
I will check that out, thanks!
@hills-code I was able to get it to work! Btw, do you only train the added layers and not even the LM head? And also, do you think directly going for...
Hey @Beomi, any clarity on this? It would be deeply appreciated. I have run the custom (Type I method) training script to pre-train on a dataset, but I also wanted to do...
Yeah, that was the first thing I tried, but without success. I will share my prompts and responses here soon.
@ykim362 I wasn't able to share the prompts earlier as they were proprietary, but I was able to replicate the issue on a bunch of publicly shareable prompts. (P.S: Conv...
I noticed a similar trend with Phi-3.5. Am I missing something?
Resolved in this thread: https://huggingface.co/microsoft/Phi-4-mini-instruct/discussions/11
I am facing a similar issue when trying to load and save “google/gemma-2b”.