Philip May
@NickWithBotronics this fix is even better: #150
Well, I am not 100% sure if this really works. It just does not raise an exception anymore. If you have first results of a Phi-2 MoE model please let...
@NickWithBotronics a bit off-topic: for research you could use TinyLlama instead of Phi-2. That should work 100%.
Hey @cg123 - can you perhaps help implement this?
> @PhilipMay did you get through this?

No, but I had success with the Llamafied Phi-3 version. See here: https://huggingface.co/PhilipMay/Phi-3-MoE-mini-4k-instruct-raw
When I merge phi-3 I get this error btw:
```
mergekit-moe --trust-remote-code ./phi3-merge-2.yml phi3-merge-2
configuration_phi3.py: 100%|██████████| 10.4k/10.4k [00:00
```
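For context, the config follows the usual mergekit-moe layout. A minimal sketch only, the base model, expert models and prompts below are placeholders and not my actual phi3-merge-2.yml:

```yaml
# Sketch of a mergekit-moe config; model names and prompts are placeholders.
base_model: microsoft/Phi-3-mini-4k-instruct
gate_mode: hidden      # route based on hidden-state representations of the positive prompts
dtype: bfloat16
experts:
  - source_model: microsoft/Phi-3-mini-4k-instruct
    positive_prompts:
      - "general chat and instruction following"
  - source_model: microsoft/Phi-3-mini-4k-instruct
    positive_prompts:
      - "code and math questions"
```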
@cg123 Did you combine multiple LLMs into a MoE and then do some further training and experimentation? Are you willing to share the configurations and ideas behind this? We...
> But if we put that aside a bit, would it be more general if we use `head_params` instead of `logistic_regression_kwargs`? Since it's for two different versions (`sklearn` and `pytorch`)...
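To make the `head_params` idea concrete, this is roughly how it would look from the user side. A sketch only, assuming the parameters are simply forwarded to whichever head is selected:

```python
from setfit import SetFitModel

# sklearn head (default): head_params would be forwarded to LogisticRegression
model_sklearn = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    head_params={"max_iter": 1000, "solver": "liblinear"},
)

# PyTorch differentiable head: head_params configure the torch classification head
model_torch = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    use_differentiable_head=True,
    head_params={"out_features": 2},  # number of classes
)
```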
Hi Tom, thanks! Are you the new main SetFit maintainer from HF now?