ort Investigate IPFS as a storage back-end

See https://ipfs.io/ and https://github.com/ipfs/ipfs. Maybe a good way to share data with ClearlyDefined? What do you think, @jeffmcaffer?

Edit: Probably do https://github.com/oss-review-toolkit/ort/issues/6603 first.

Edit: More (German) resources are

https://www.golem.de/news/dateisystem-ipfs-fuer-eine-neue-adressierung-und-verteilung-von-inhalten-2308-176495.html
https://0x0d.de/2022/01/0d089-interplanetary-filesystem/

Feb 24 '19 09:02 sschuberth

Interesting tech. How would you see that working? For scenarios like ORT it seems like there are several interactions:

1. Getting definitions. This enables ORT users to benefit from the ClearlyDefined community curations etc. in their ORT-based policies and processes. Given some component found using ORT, the user can quickly and easily get the curated license, copyright, ... and drive their workflows.
2. Contributing curations. If in the above workflows there are issues with the supplied definitions (e.g., a missing license), the user can curate using ORT and supply that curation back to ClearlyDefined for integration and upstreaming.
3. Raw tool results. I don't know enough about all of the ORT processing, but if ORT wants to do some analysis etc. that ClearlyDefined does not do in the summarization while creating definitions, then it can get the raw results from ClearlyDefined and do said work.

Does this match your expectations/understanding?

Mar 01 '19 00:03 jeffmcaffer

Yes, I was mostly thinking about your first two scenarios. Phrased differently, use IPFS to create a shared "pool" of curations and / or scan results that any party can contribute to and / or read from.

Mar 01 '19 07:03 sschuberth

For reading it could be interesting to look at putting an IPFS front-end on our backing stores. For example, the tool results are all stored in Blob using obviously named blobs. I don't know enough about IPFS to know if this is feasible but you could imagine an implementation that surfaced those blobs in their protocol. Similarly for the curations (those are all stored in a document db).
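As a rough illustration of that read path, here is a minimal sketch of a read-only, content-addressed index over named blobs. Everything in it is an assumption made for illustration: the `ContentAddressedIndex` class, its method names, and the example blob name are hypothetical, and a plain SHA-256 hex digest stands in for a real IPFS content identifier. It is neither the ClearlyDefined store layout nor an IPFS API.

```python
import hashlib


class ContentAddressedIndex:
    """Hypothetical read-only view that maps blob names to content hashes."""

    def __init__(self):
        self._hash_by_name = {}  # blob name -> content hash
        self._data_by_hash = {}  # content hash -> raw bytes

    def publish(self, name: str, data: bytes) -> str:
        """Register an existing blob and return its content hash."""
        # Assumption: a SHA-256 hex digest stands in for a real IPFS CID.
        digest = hashlib.sha256(data).hexdigest()
        self._hash_by_name[name] = digest
        self._data_by_hash[digest] = data
        return digest

    def resolve(self, name: str) -> str:
        """Look up the content hash for a named blob."""
        return self._hash_by_name[name]

    def fetch(self, digest: str) -> bytes:
        """Return the bytes for a hash, verifying them before handing them out."""
        data = self._data_by_hash[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored content does not match its hash")
        return data


# Hypothetical blob name; the real naming scheme is not shown in this thread.
index = ContentAddressedIndex()
address = index.publish("scancode/npm/left-pad/1.3.0.json", b'{"licenses": ["WTFPL"]}')
print(index.fetch(index.resolve("scancode/npm/left-pad/1.3.0.json")) == b'{"licenses": ["WTFPL"]}')
```

The point of the sketch is only that readers could verify what they fetch against its address, while the name-to-hash mapping stays under the control of whoever runs the backing store.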

Writing is different, as that needs to be controlled for both tool outputs and curations. We carefully manage which tools are run, along with their versions and configurations, so as to get consistent and trusted results. Curations must go through the pull request process for review and transparency.

Mar 01 '19 16:03 jeffmcaffer

Note to myself: A comparison to https://github.com/seaweedfs/seaweedfs would be interesting.

Feb 06 '24 13:02 sschuberth
