GitHub Release as a backend
Just looking at the remote storage options here: https://dvc.org/doc/user-guide/data-management/remote-storage
My go-to place to host data resulting from code in GitHub repos is, naturally I think, releases.
gh release create v1 data.csv
Should work, should be possible to support?
Cc @anitagraser whose great tutorial on this led to this question.
@Robinlovelace Could you elaborate, please? I'm not sure I understand how it would work. Every new dvc push creating a new release?
Same as it works for other remote storage options is my thinking.
@Robinlovelace So updating the same release artifacts? Just trying to understand if you have particular ideas or if this is just a general idea.
Yes, just a general idea at this stage, with no GitHub-specific thoughts on implementation.
I remember I was discussing this with someone. The benefit is that by default it has pretty nice limits for a happy-path scenario: Each file included in a release must be under 2 GiB. There is no limit on the total size of a release, nor bandwidth usage.
It felt that even a single release could be a DVC remote by itself if you put it under a proper API.
It also reminds me a bit of this proposal and discussion by @sisp https://gitlab.com/gitlab-org/gitlab/-/issues/413612, and the motivation may be similar (we also had a discussion on GitLabFS somewhere in one of our repositories).
I think the minimum set of operations we need is supported in their API:
https://docs.github.com/en/free-pro-team@latest/rest/releases/assets?apiVersion=2022-11-28
But I wonder whether we need to consider any terms and conditions regarding the usage of release assets.
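To make the "minimum set of operations" a bit more concrete, here is a rough sketch of the four release-asset calls (list, upload, download, delete) as I understand them from the GitHub REST docs linked above. It only builds the requests and never sends them. The class name `ReleaseAssetRequests`, the method names, and the example owner/repo/token are all made up for illustration; none of this is DVC code, and how DVC would actually map its cache onto asset names is left open.

```python
import urllib.parse
import urllib.request

API = "https://api.github.com"
UPLOADS = "https://uploads.github.com"  # uploads use a separate host


class ReleaseAssetRequests:
    """Hypothetical helper: builds (does not send) release-asset requests."""

    def __init__(self, owner, repo, token):
        self.base = f"/repos/{owner}/{repo}/releases"
        self.headers = {
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        }

    def _request(self, method, url, data=None, extra=None):
        headers = {**self.headers, **(extra or {})}
        return urllib.request.Request(url, data=data, headers=headers, method=method)

    def list_assets(self, release_id):
        # Enumerate what is already stored in the release.
        return self._request("GET", f"{API}{self.base}/{release_id}/assets")

    def upload_asset(self, release_id, name, data):
        # The asset name is passed as a query parameter; the body is the raw bytes.
        query = urllib.parse.quote(name)
        url = f"{UPLOADS}{self.base}/{release_id}/assets?name={query}"
        return self._request(
            "POST", url, data=data, extra={"Content-Type": "application/octet-stream"}
        )

    def download_asset(self, asset_id):
        # Asking for octet-stream returns the file content instead of JSON metadata.
        url = f"{API}{self.base}/assets/{asset_id}"
        return self._request("GET", url, extra={"Accept": "application/octet-stream"})

    def delete_asset(self, asset_id):
        return self._request("DELETE", f"{API}{self.base}/assets/{asset_id}")
```

Each method returns a `urllib.request.Request`, so actually sending one would be `urllib.request.urlopen(req)` with a real token. Note that upload needs the numeric release ID, not the tag, so a real backend would first have to look up the release.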