Nicolas Mejia Petit

Results 52 comments of Nicolas Mejia Petit

> Well, I am not 100% sure if this really works. It just does not raise an exception anymore. If you have first results of a Phi-2 MoE model please...

![image](https://github.com/arcee-ai/mergekit/assets/122953474/5144f32d-15ca-44f1-91a0-43b4109e58cb) Exactly. Does it matter what code Phi is using? I can test it out tomorrow. I used microsoft/phi2 and I have 2 snapshots installed; no clue which one it used....

> @NickWithBotronics a bit off-topic: For research you could use tiny-llama instead of phi-2. That should work 100%. I saw minichat, which got better benchmarks, however I haven't real world...

@cg123 Do you think you could share with me the script you used for separating agents ([here](https://huggingface.co/chargoddard/demixtral))? I want to try this with the newest 8x22b, released...

Thank you so much! I’m sure it wasn’t easy to find a single Python file among all the different projects you do. I appreciate it, and all the other open...

Raise, edit: I am currently nowhere near good enough at programming to do this, but it would be pretty cool to use some of the bnb paged 8bit adamw...

If there happens to be a branch or PR for this, I’d love to see it! Could you share a link?

@enzoli977 Windows is a huge, complicated garbage fire. Use the wheels I showed in the home.md in #210, along with installing the proper Visual Studio build tools, shown in a...

I experienced the same thing! Over 3 epochs with the same setup, just updated code and flash attention, the loss went from 6 to 2. And on the old code without...