Awni Hannun
Thanks @m0saan. I'm not sure we need parameter groups yet. Let's keep this issue open but I would mark it as low priority until we have reason to observe otherwise....
Hey @Jyun1998, sorry, but somehow my main comment did not get included. I must have forgotten to hit the save button. Basically I'm wondering what reference you use for...
@Jyun1998 are you still planning to follow up on this?
@Jyun1998 got it. We should keep it simple until we see that we need more features. Could you follow the [PyTorch cosine similarity](https://pytorch.org/docs/stable/generated/torch.nn.functional.cosine_similarity.html) loss? I think that one covers the...
> Even though there's also margin for pytorch F.cosine_similarity

I don't see the margin in the docs? Is it in the source code?
I see, thanks. Yes let's go with the plain cosine similarity for now. Thank you!
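For reference, here is a rough sketch of what "plain" cosine similarity without a margin could look like, written in pure Python so it stands alone. The function names, the `eps` clamping of each norm, and the choice of `1 - similarity` as the loss are all assumptions for illustration, not the PR's code or the MLX API.

```python
import math

def cosine_similarity(x1, x2, eps=1e-8):
    """Cosine similarity of two equal-length vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(x1, x2))
    norm1 = math.sqrt(sum(a * a for a in x1))
    norm2 = math.sqrt(sum(b * b for b in x2))
    # Assumption: clamp each norm with eps so a zero vector does not divide by zero.
    return dot / (max(norm1, eps) * max(norm2, eps))

def cosine_similarity_loss(x1, x2, eps=1e-8):
    """Assumed loss form: 0 when the vectors point the same way, 2 when opposite."""
    return 1.0 - cosine_similarity(x1, x2, eps)
```

There is no margin parameter here, which is the simplification discussed above.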
Also could you rebase and resolve conflicts?
I think this is separate from parameter groups. We may want parameter groups but I think that is a lower priority at the moment.
Yea I'm not disagreeing that one can also have separate schedules for different parameters. But we can add schedules as a separate feature, without support for parameter groups. If we...
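To make the point concrete, here is a minimal sketch of a schedule that works without any notion of parameter groups: the learning rate is either a float or a callable of the step count, and one value applies to every parameter. `cosine_decay` and `ToySGD` are hypothetical names for illustration only, not the MLX optimizer API.

```python
import math

def cosine_decay(base_lr, total_steps):
    """Return a schedule mapping a step count to a learning rate (hypothetical)."""
    def schedule(step):
        # Fraction of the decay completed, capped at 1 once total_steps is reached.
        t = min(step, total_steps) / total_steps
        return 0.5 * base_lr * (1.0 + math.cos(math.pi * t))
    return schedule

class ToySGD:
    """Toy optimizer: one global learning rate, no parameter groups."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate  # a float or a callable of the step
        self.step = 0

    def current_lr(self):
        if callable(self.learning_rate):
            return self.learning_rate(self.step)
        return self.learning_rate

    def update(self, params, grads):
        # The same learning rate is used for every parameter.
        lr = self.current_lr()
        self.step += 1
        return [p - lr * g for p, g in zip(params, grads)]
```

With this shape, per-group schedules could be layered on later without changing how a single schedule is expressed.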
I'm slightly confused: there are two ongoing PRs for scatter ops in MLX 🤔 (#394 has bindings as well). It seems like they are going for different APIs, but otherwise...