Release DynaCLR artifacts (model, dataset) on Hugging Face
Hi @ziw-liu 🤗
Niels here from the open-source team at Hugging Face. I discovered your work on arXiv and was wondering whether you would like to submit it to hf.co/papers to improve its discoverability. If you are one of the authors, you can submit it at https://huggingface.co/papers/submit.
The paper page lets people discuss your paper and find its artifacts (your models, datasets, or demo, for instance). You can also claim the paper as yours, which will show up on your public HF profile, and add GitHub and project page URLs.
It'd be great to make the checkpoints and dataset available on the 🤗 hub, to improve their discoverability/visibility. We can add tags so that people find them when filtering https://huggingface.co/models and https://huggingface.co/datasets.
Uploading models
See here for a guide: https://huggingface.co/docs/hub/models-uploading.
In this case, we could leverage the PyTorchModelHubMixin class, which adds from_pretrained and push_to_hub to any custom nn.Module. Alternatively, one can use the hf_hub_download one-liner to download a checkpoint from the Hub.
We encourage researchers to push each model checkpoint to a separate model repository, so that things like download stats also work. We can then also link the checkpoints to the paper page.
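As a minimal sketch of the mixin approach (the model class, hidden size, and repo id below are hypothetical placeholders, not names from your codebase):

```python
import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin, hf_hub_download

# Hypothetical example model; inheriting from PyTorchModelHubMixin
# adds save_pretrained, from_pretrained, and push_to_hub methods.
class MyModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, hidden_size: int = 128):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        return self.linear(x)

model = MyModel(hidden_size=128)

# After authenticating (e.g. `huggingface-cli login`), push to the Hub:
# model.push_to_hub("your-hf-org-or-username/your-model")

# Anyone can then reload it with:
# model = MyModel.from_pretrained("your-hf-org-or-username/your-model")

# Alternatively, fetch a single checkpoint file directly:
# path = hf_hub_download(
#     repo_id="your-hf-org-or-username/your-model",
#     filename="model.safetensors",
# )
```

The push/download calls are commented out since they require authentication and an existing repo; the rest runs as-is.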
Uploading dataset
Would be awesome to make the dataset available on 🤗, so that people can do:

```python
from datasets import load_dataset

dataset = load_dataset("your-hf-org-or-username/your-dataset")
```
See here for a guide: https://huggingface.co/docs/datasets/loading.
Besides that, there's the dataset viewer, which allows people to quickly explore the first few rows of the data in the browser.
Let me know if you're interested/need any help regarding this!
Cheers,
Niels ML Engineer @ HF 🤗
@edyoshikun
@NielsRogge thanks for making us aware of this possibility and offering your help.
A few questions:
- By a demo linked to a paper, do you mean a Space? Can the Space be hosted in our HF org (chanzuckerberg), which has an Enterprise subscription?
- We are particularly interested in ZeroGPU inference. We use Python 3.11 and above, but ZeroGPU supported only Python 3.10 when @ziw-liu and @edyoshikun last tried with this Space: https://huggingface.co/spaces/chanzuckerberg/Cytoland. Is there a solution?
- Our datasets are quite large (5-10 TB in Zarr format), and we have a partnership with AWS Open Data. Can we create a dataset descriptor on Hugging Face (to obtain a DOI) with just a metadata table that points to AWS Open Data?
- Yes, the demo could be hosted on hf.co/spaces using your Enterprise subscription
- I'll ping @hysts on that
- That sounds good! Otherwise, feel free to contact us about a storage grant, which we might be able to provide: https://huggingface.co/docs/hub/en/storage-limits#sharing-large-datasets-on-the-hub
Hi @mattersoflight , unfortunately, ZeroGPU currently supports only Python 3.10, and there's no workaround to use Python 3.11 and above.
Hi @hysts, is there a roadmap we can follow on this? For context, Python 3.10 is no longer supported by fundamental scientific packages (per SPEC 0), including some of our core dependencies, such as NumPy.
Hi @ziw-liu , I was wondering about it too, so I asked the infra team internally. I’ll let you know when they reply. cc @cbensimon
I got a reply from the infra team. They don't have a concrete plan to support newer versions yet, but they do recognize that Python 3.10 is getting a bit outdated and said they'll keep that in mind.
Hi @hysts and @NielsRogge, any update on your ability to support Python 3.11 with ZeroGPU? Do other inference services support Python 3.11?
Now that Python 3.14 is out, we will soon have to move our supported window to 3.12-3.14 to keep up with the rest of the scientific Python ecosystem.
Unfortunately, as far as I know, there hasn't been any update on that.