Investigate IPFS as a storage back-end
See https://ipfs.io/ and https://github.com/ipfs/ipfs. Maybe a good way to share data with ClearlyDefined? What do you think, @jeffmcaffer?
Edit: Probably do https://github.com/oss-review-toolkit/ort/issues/6603 first.
Edit: More resources (in German):
- https://www.golem.de/news/dateisystem-ipfs-fuer-eine-neue-adressierung-und-verteilung-von-inhalten-2308-176495.html
- https://0x0d.de/2022/01/0d089-interplanetary-filesystem/
Interesting tech. How would you see that working? For scenarios like ORT it seems like there are several interactions:
- Getting definitions. This enables ORT users to benefit from the ClearlyDefined community curations etc. in their ORT-based policies and processes. Given some component found using ORT, the user can quickly and easily get the curated license, copyright, ... and drive their workflows.
- Contributing curations. If in the above workflows there are issues with the supplied definitions (e.g., a missing license), the user can curate using ORT and supply that curation back to ClearlyDefined for integration and upstreaming.
- Raw tool results. I don't know enough about all of the ORT processing, but if ORT wants to do some analysis etc. that ClearlyDefined does not do in the summarization while creating definitions, then it can get the raw results from ClearlyDefined and do said work.
Does this match your expectations/understanding?
Yes, I was mostly thinking about your first two scenarios. Phrased differently, use IPFS to create a shared "pool" of curations and / or scan results that any party can contribute to and / or read from.
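To make the "pool" idea a bit more concrete, here is a minimal sketch of content addressing, the property that would let any party verify what it reads from such a pool. This is a toy in-memory store, not the IPFS API: real IPFS CIDs involve chunking and multihash encoding rather than a plain SHA-256 hex digest, and the class name, the coordinates string and the curation fields below are all made up for illustration.

```python
import hashlib
import json


class ContentAddressedPool:
    """Toy in-memory pool keyed by the SHA-256 digest of the stored bytes.

    Hypothetical helper for illustration only; not part of ORT, ClearlyDefined or IPFS.
    """

    def __init__(self):
        self._blocks = {}

    def put(self, document: dict) -> str:
        # Serialize deterministically so that equal documents get equal addresses.
        data = json.dumps(document, sort_keys=True, separators=(",", ":")).encode("utf-8")
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> dict:
        data = self._blocks[address]
        # A reader can re-hash the bytes and check them against the address,
        # so it does not need to trust whoever served the data.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return json.loads(data)


pool = ContentAddressedPool()
# Made-up example curation; the field names are assumptions, not a real schema.
curation = {"coordinates": "npm/npmjs/-/example/1.0.0", "licensed": {"declared": "MIT"}}
address = pool.put(curation)
```

Identical curations contributed by different parties end up under the same address, so duplicates collapse naturally. What the sketch does not cover is who is allowed to add entries and how readers discover addresses in the first place.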
For reading, it could be interesting to look at putting an IPFS front-end on our backing stores. For example, the tool results are all stored in Blob using obviously named blobs. I don't know enough about IPFS to know if this is feasible, but you could imagine an implementation that surfaces those blobs via the IPFS protocol. The same goes for the curations (those are all stored in a document DB).
Writing is different, as it needs to be controlled for both tool outputs and curations. We carefully manage which tools are run, and in which versions and configurations, so as to get consistent and trusted results. Curations must go through the pull request process for review and transparency.
Note to myself: A comparison to https://github.com/seaweedfs/seaweedfs would be interesting.