Global environmental/Guardian data search (indexer) component for Hedera and IPFS

Open anvabr opened this issue 1 year ago • 0 comments

Problem description

Currently, each Guardian instance contains a local DB which 'caches' the relevant data (produced by that instance) from Hedera topics and IPFS. In theory this allows quick query, discovery and analysis operations, but only within the scope of the current instance.

Requirements

  • Improve the data storage and indexing capabilities of Guardian for the data belonging to the local instance such that complex analytical queries can be run efficiently, such as 'search for data similar to this' and 'how likely is this to be a double entry for something submitted elsewhere'.
  • Introduce a global search and indexing capability for data produced by other (all) instances such that the queries above can be run on the entire body of Guardian data produced from the beginning of time (in the blockchain sense).
  • Extend #2281 so that users can preview the usage of the block without having to import "other SR's" policy into their Guardian instance.
  • Make sure that user-defined attributes are indexed and searchable.
  • Make sure this works with tags: they are indexed and searchable.
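
To make the 'search for data similar to this' requirement more concrete, here is a minimal sketch of one possible approach, not a description of how Guardian implements it. It assumes documents are plain JSON, flattens each one into (path, value) features so that nested user-defined attributes and tags are indexed alongside everything else, and ranks candidates by Jaccard similarity. The names `flatten` and `InvertedIndex` and the sample documents are hypothetical.

```python
from collections import defaultdict


def flatten(doc, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON-like value."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(doc, list):
        for item in doc:
            # List positions are ignored so that reordered arrays still match.
            yield from flatten(item, f"{prefix}[]")
    else:
        yield prefix, doc


class InvertedIndex:
    """Hypothetical in-memory index; a real deployment would use a database."""

    def __init__(self):
        self.postings = defaultdict(set)  # (path, value) -> document ids
        self.features = {}                # document id -> set of (path, value)

    def add(self, doc_id, doc):
        feats = set(flatten(doc))
        self.features[doc_id] = feats
        for feat in feats:
            self.postings[feat].add(doc_id)

    def search(self, query):
        """Return (doc_id, Jaccard score) pairs, most similar first."""
        q = set(flatten(query))
        candidates = set()
        for feat in q:
            candidates |= self.postings.get(feat, set())
        scored = [
            (d, len(q & self.features[d]) / len(q | self.features[d]))
            for d in candidates
        ]
        return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


index = InvertedIndex()
index.add("vc-1", {"project": "solar-A", "tons": 120, "tags": ["verified", "2023"]})
index.add("vc-2", {"project": "solar-A", "tons": 120, "tags": ["2023"]})
index.add("vc-3", {"project": "wind-B", "tons": 40, "tags": []})
results = index.search({"project": "solar-A", "tons": 120, "tags": ["2023", "verified"]})
```

A score close to 1.0 for a document submitted by a different instance would be a natural signal for the 'double entry' question, although the threshold and the choice of similarity measure are open design decisions.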

Definition of done

  • A stand-alone global indexing module of Guardian can be deployed to scan the Hedera testnet and mainnet and the relevant IPFS storage for data, either on a regular schedule or when triggered manually.
  • Documentation updated accordingly
  • An example (developer-level) web interface exists where complex queries can be submitted and results displayed. For the example queries above, the results are shown as a list of documents sorted by similarity (most 'similar' on top).
  • Module functionality, searches, etc. are accessible via a standalone UI.
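
The 'regularly (or manually triggered) scan' item could be sketched as an incremental pass that remembers, per topic, the last message it has indexed. This is only an illustration under stated assumptions: `fetch_messages` is a hypothetical callable standing in for whatever client reads topic messages, the `sequence_number` field name is assumed, and `checkpoints` is a plain dictionary where a real module would use persistent storage.

```python
def scan_topic(topic_id, fetch_messages, checkpoints, handle):
    """Index messages newer than the stored checkpoint; return how many were processed."""
    last_seen = checkpoints.get(topic_id, 0)
    processed = 0
    for message in fetch_messages(topic_id, last_seen):
        handle(topic_id, message)
        # Advance the checkpoint so the next scan skips this message.
        last_seen = max(last_seen, message["sequence_number"])
        processed += 1
    checkpoints[topic_id] = last_seen
    return processed


# Stand-in data source used only to exercise the sketch (hypothetical topic id).
FAKE_TOPIC = [{"sequence_number": n, "payload": f"doc-{n}"} for n in range(1, 6)]


def fake_fetch(topic_id, after_sequence):
    return [m for m in FAKE_TOPIC if m["sequence_number"] > after_sequence]


indexed = []
checkpoints = {}
first_run = scan_topic("0.0.1234", fake_fetch, checkpoints, lambda t, m: indexed.append(m["payload"]))
second_run = scan_topic("0.0.1234", fake_fetch, checkpoints, lambda t, m: indexed.append(m["payload"]))
```

The same function serves both triggers: a scheduler calls it periodically, and a manual trigger calls it once on demand. Because the checkpoint is only advanced past messages that were handled, repeated runs do not index a message twice.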

Acceptance criteria

  • The indexing component can be used to search for and find documents containing data similar to arbitrary JSON documents and sub-sections of documents.
  • The user can then display the documents in the document 'diff' UI, where similar data are highlighted.
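
As an illustration of the data a 'diff' view would need, the sketch below compares two JSON documents leaf by leaf and groups the paths into matching, changed, and one-sided fields. It is a simplified assumption of how such a comparison might work, not the existing Guardian comparison logic; `leaves`, `diff` and the sample documents are hypothetical.

```python
def leaves(doc, prefix=""):
    """Map each leaf path of a nested JSON-like value to its value."""
    out = {}
    if isinstance(doc, dict):
        for key, value in doc.items():
            out.update(leaves(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(doc, list):
        for i, item in enumerate(doc):
            out.update(leaves(item, f"{prefix}[{i}]"))
    else:
        out[prefix] = doc
    return out


def diff(left, right):
    """Classify every leaf path as same, changed, only_left or only_right."""
    a, b = leaves(left), leaves(right)
    report = {"same": [], "changed": [], "only_left": [], "only_right": []}
    for path in sorted(a.keys() | b.keys()):
        if path not in b:
            report["only_left"].append(path)
        elif path not in a:
            report["only_right"].append(path)
        elif a[path] == b[path]:
            report["same"].append(path)
        else:
            report["changed"].append(path)
    return report


report = diff(
    {"project": "solar-A", "tons": 120, "site": {"country": "KE"}},
    {"project": "solar-A", "tons": 125, "verifier": "x"},
)
```

The paths under `same` are the ones a UI would highlight as similar data, while `changed`, `only_left` and `only_right` mark where the two documents diverge.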

anvabr, Sep 04 '23 13:09