mergekit: Add a push_to_hub method

Very nice project! I'm the Machine Learning Librarian at Hugging Face. We're seeing quite a few merged models produced via MergeKit being uploaded to the Hugging Face Hub. I wanted to ask two questions in response to this:

1. Have you considered adding a push_to_hub command to MergeKit? Unless I missed something, this is not directly included at the moment. We could also add a mergekit tag when uploading the model to the Hub, which would make it easy to find models created via MergeKit.
2. Related to this, it could be nice to use the YAML config to insert some metadata directly into the model card on the Hub. Some libraries have started adding very nice auto-generated model cards (see SpanMarker as an example). This could be fairly simple to do, and it would make it easier for people to discover and understand merged models shared on the Hub.

Happy to help with these if there's interest from your side!

cc @Wauplin @osanseviero

Dec 18 '23 14:12 davanstrien

Just to add, thanks to @julien-c, we will very soon have support for an array of models in our base_model metadata (https://huggingface.co/docs/hub/model-cards#specifying-a-base-model). This should make it possible to easily extract information from the merge config into a format the Hub can understand. It will also help answer questions such as which models are most often used in merges!
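For illustration, here is a minimal sketch of how the source models in a merge config could be turned into model card metadata. It assumes the YAML config has already been parsed into a dict and that models appear under `base_model`, `models[].model`, or `slices[].sources[].model`. The helper names, the example model IDs, and the `merge` tag are hypothetical and not part of MergeKit.

```python
from __future__ import annotations


def collect_source_models(config: dict) -> list[str]:
    """Return the unique model names in a parsed merge config, in first-seen order."""
    found: list[str] = []

    def add(name: object) -> None:
        # Skip missing or non-string entries and avoid duplicates.
        if isinstance(name, str) and name not in found:
            found.append(name)

    add(config.get("base_model"))
    for entry in config.get("models", []):
        add(entry.get("model"))
    for merge_slice in config.get("slices", []):
        for source in merge_slice.get("sources", []):
            add(source.get("model"))
    return found


def render_front_matter(models: list[str], tags: tuple[str, ...] = ("mergekit", "merge")) -> str:
    """Render a YAML front matter block with a base_model array and tags."""
    lines = ["---"]
    if models:
        lines.append("base_model:")
        lines += [f"- {name}" for name in models]
    lines.append("tags:")
    lines += [f"- {tag}" for tag in tags]
    lines.append("---")
    return "\n".join(lines)


# Hypothetical config, as it might look after parsing the YAML file.
example_config = {
    "merge_method": "slerp",
    "base_model": "org-a/model-a",
    "slices": [
        {
            "sources": [
                {"model": "org-a/model-a", "layer_range": [0, 32]},
                {"model": "org-b/model-b", "layer_range": [0, 32]},
            ]
        }
    ],
}

print(render_front_matter(collect_source_models(example_config)))
```

The printed block could be placed at the top of a generated README.md so the Hub can index the source models and the tag.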

Dec 19 '23 16:12 davanstrien

Hey, thanks for reaching out! I'm definitely interested in supporting better indexing and discovery of merged models. For a while I've been thinking that it could be super interesting to track the lineage of merged models and visualize it in some way, and I'd be glad to make this sort of data available in a consistent, structured format.

Generating a model card with some pre-populated metadata sounds like a great idea. An array of models in base_model is easy enough. An approximate percentage breakdown could also be made if that would be interesting.
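As a rough sketch of what an approximate percentage breakdown could look like: the function below counts the layers each model contributes across slices and splits each slice's layers equally among its sources. This is only one crude heuristic under the assumption of a `slices[].sources[]` layout with `layer_range` pairs. It ignores merge weights entirely, and the function name is hypothetical.

```python
from __future__ import annotations


def approximate_breakdown(config: dict) -> dict[str, float]:
    """Estimate each model's share of a merge, as a percentage of layers contributed."""
    shares: dict[str, float] = {}
    for merge_slice in config.get("slices", []):
        sources = merge_slice.get("sources", [])
        for source in sources:
            start, end = source["layer_range"]
            # Each source in a slice gets an equal share of that slice's layers.
            contribution = (end - start) / len(sources)
            shares[source["model"]] = shares.get(source["model"], 0.0) + contribution
    total = sum(shares.values())
    if not total:
        return {}
    return {name: round(100 * share / total, 1) for name, share in shares.items()}


# Hypothetical passthrough-style config that stacks layers from two models.
stacked_config = {
    "slices": [
        {"sources": [{"model": "org-a/model-a", "layer_range": [0, 30]}]},
        {"sources": [{"model": "org-b/model-b", "layer_range": [10, 20]}]},
    ]
}

print(approximate_breakdown(stacked_config))
```

A breakdown like this could go in the body of the model card rather than in the metadata.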

As for pushing to the Hub from inside MergeKit, I'm hesitant to make that an automatic step. I think at least a little friction is a good idea to encourage actual testing of merges before uploading them. The Hugging Face Hub is a really valuable resource for discovering open source models right now, and it'd be terrible to inadvertently harm that by flooding the public namespace with one-off random merges. If you're confident in that not being an issue, though, I could definitely be talked into it. :)

I'm on vacation right now and won't be doing any substantial development in the next week or so, but once I'm back this will definitely be a high priority. Excited to make this happen.

Dec 22 '23 11:12 cg123

@cg123 sounds great!

> Generating a model card with some pre-populated metadata sounds like a great idea. An array of models in base_model is easy enough. An approximate percentage breakdown could also be made if that would be interesting.

This would be great. We probably couldn't use the percentages in the current metadata, but it could be nice to add this breakdown elsewhere in the model card.

> As for pushing to the hub inside of MergeKit, I'm hesitant to make that an automatic step. I think at least a little friction is a good idea to encourage actual testing of merges before uploading them. The Hugging Face hub is a really valuable resource for discovering open source models right now and it'd be terrible to inadvertently harm that by flooding the public namespace with one-off random merges. If you're confident in that not being an issue, though, I could definitely be talked into it. :)

Most libraries take the approach of optionally pushing checkpoints during training and have a separate push_to_hub method for pushing a model outside of a training loop. Probably here, the latter makes more sense.
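A minimal sketch of such a separate, explicit upload step, assuming the merged model has already been written to a local folder and tested. The function name and its parameters are hypothetical and not part of MergeKit. The upload itself uses `HfApi.create_repo` and `HfApi.upload_folder` from the huggingface_hub library, which is only imported when an upload actually happens.

```python
from __future__ import annotations

from pathlib import Path


def push_merged_model(
    output_dir: str,
    repo_id: str,
    private: bool = True,
    dry_run: bool = False,
) -> list[str]:
    """Upload a merged model folder to the Hub and return the files involved."""
    folder = Path(output_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"No merged model found at {folder}")

    # List the files relative to the folder so the caller can review them.
    files = sorted(
        path.relative_to(folder).as_posix()
        for path in folder.rglob("*")
        if path.is_file()
    )
    if dry_run:
        # Report what would be uploaded without contacting the Hub.
        return files

    # Imported lazily so a dry run works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id=repo_id, private=private, exist_ok=True)
    api.upload_folder(
        folder_path=str(folder),
        repo_id=repo_id,
        commit_message="Upload merged model",
    )
    return files
```

Defaulting to a private repository and offering a dry run are design choices made for this sketch. They keep a little friction in place, so nothing becomes public until the user makes it so.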

> I'm on vacation right now and won't be doing any substantial development in the next week or so, but once I'm back this will definitely be a high priority. Excited to make this happen.

I'm also off until the start of January, but I'm definitely happy to help with this when I'm back! Hope you have a good holiday!

Dec 23 '23 10:12 davanstrien
