Will M2-GPT be open-sourced?

Open yangsp5 opened this issue 2 years ago • 13 comments

Will M2-GPT be open-sourced? It seems interesting

yangsp5 avatar Oct 30 '23 11:10 yangsp5

Yes, will be putting it up this week!

DanFu09 avatar Oct 30 '23 14:10 DanFu09

Thank you for your great work! Is M2-GPT open sourcing postponed?

LSinev avatar Dec 05 '23 21:12 LSinev

GPT code, or it didn't happen. Extraordinary claims require extraordinary proof. The paper is very convincing and INCREDIBLY well written, but does the causal model perform as well as claimed in the paper? The best test would be to release the training code in the style of Andrej Karpathy's minGPT/nanoGPT/llama2.c.

avesus avatar Jan 25 '24 00:01 avesus

@DanFu09 any update on this? I can't seem to find the checkpoints. At a minimum, I would love to see the yamls so I can experiment locally. Great work putting models out with Together AI btw!

lhallee avatar Jan 29 '24 17:01 lhallee

Do you plan on releasing the weights of the causal M2 models, or just the code?

redbrain avatar Feb 17 '24 23:02 redbrain

Hi all, thanks for all the interest here! I’m a bit swamped with faculty apps right now but will try to get the code up in my downtime.

The models were quite undertrained (5-15B tokens only) just for an initial scaling experiment so we don’t plan to release them.

DanFu09 avatar Feb 17 '24 23:02 DanFu09

Hello, it's been a couple of weeks. Just wanted to check on the status of the M2-GPT implementation release.

redbrain avatar Apr 12 '24 17:04 redbrain

First thing on my list once the faculty interviews finish up! (One more week I promise 🤞)

(It's mostly done and sitting on a private branch; I just need to fix up a few more bits of configs and merge things.)

DanFu09 avatar Apr 13 '24 03:04 DanFu09

Checking in one more time, since it's been another two weeks! Is it possible to get an ETA on the M2-GPT release? (Sorry for the persistent reminders, I understand you're busy and just want to make sure this doesn't get buried under everything else.)

redbrain avatar Apr 28 '24 23:04 redbrain

I'm very hopeful that I'll be able to put it out this week 🤞

DanFu09 avatar May 11 '24 22:05 DanFu09

Here's another two-week check-in, hopefully the last one :) How's it looking right now?

redbrain avatar May 24 '24 19:05 redbrain

Also interested in this, would you be able to release the code?

sanjayss34 avatar Jun 06 '24 01:06 sanjayss34

Hi :)

I uploaded a new config and some code changes to a branch of safari: https://github.com/HazyResearch/safari/tree/flashfftconv.

Please see these instructions and let me know how they work: https://github.com/HazyResearch/safari/blob/flashfftconv/experiments.md#m2-gpt . You'll have to use the old fused_fft CUDA kernel in that repo (hopefully a refactor of FlashFFTConv comes soon to make it all play nice).

If it goes well I'll start the more involved surgery to get the two repos to play nice with each other (maybe just an update of the other one and a link for now).

DanFu09 avatar Jun 13 '24 21:06 DanFu09