Awni Hannun

Results 1014 comments of Awni Hannun

> For renaming the modules so that keys match, how would you suggest handling cases where the Transformers BERT model has more modular/nested modules? e.g. separate BERTIntermediate and BERTOutput layers?...

Great! We also need a README. I can help with that, just let me know your plan / when I should review.

Hey! I will take a look shortly (next 1-2 days), sorry for the delay!

@andersonbcdefg sorry for the delay. I rebased this and ran the formatting. I'm doing a little work on it now. Just curious, what were the results you were getting? For...

Cool, what about F32? It's about 1% worse than the torch version. Did you see the same? I can spend a little time investigating, but I also want to make...

I think it helped, now I see:

```
{'Banking77Classification': 0.8325974025974027, 'STS12': 0.7584972019004673}
```

The STS12 is still a bit worse than the torch model, but the banking classification is better.

@andersonbcdefg could you comment a bit on what this example adds beyond the original MLX BERT example? Is it mostly the MTEB evaluation? If so, maybe the right call is...

@andersonbcdefg sorry, I got kind of stuck on this myself w.r.t. how it should integrate with our BERT example. Maybe the answer is that it shouldn't, but then it doesn't...