
Report microservice

Open khyurri opened this issue 1 year ago • 0 comments

The current pipeline result can only be a document annotation commit. However, in some cases, pipelines produce artifacts other than annotations. For instance, search results should be displayed in different interfaces than document annotations. To cater to this need, we need to implement a new microservice: report.

report microservice

This microservice allows users and pipelines to commit any number of reports per job_id. A report entity contains the mandatory fields id, job_id, tenant, body, and created (a datetime), plus an optional field task_id.

Committed reports can't be deleted or modified. However, we need to add an archive property, which lets us hide some reports from the UI if needed.
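The entity described above could be sketched as an immutable record. This is only an illustration, not the service's actual model: the field types (integer job_id and task_id, string id), the UUID default, and the helper name archive_report are assumptions that the issue does not specify.

```python
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone
from typing import Any, Dict, Optional
from uuid import uuid4


@dataclass(frozen=True)  # frozen: a committed report cannot be modified in place
class Report:
    job_id: int  # assumed to be an integer
    tenant: str
    body: Dict[str, Any]
    task_id: Optional[int] = None  # the only optional field in the spec
    archive: bool = False  # when True, the UI may hide this report
    id: str = field(default_factory=lambda: str(uuid4()))  # assumed UUID
    created: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def archive_report(report: Report) -> Report:
    """Return a copy with the archive flag set; other fields are unchanged."""
    return replace(report, archive=True)
```

Archiving produces a flagged copy rather than mutating or deleting the record, which matches the rule that committed reports stay intact.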

Limitation: For now, we implement this functionality only to show search results provided by the search pipeline.

body - a valid JSON object that contains 2 mandatory fields: type and report. type is a predefined enum with a single value so far: search_results. When type=search_results, report must contain the key search_results with a list value. Every item in the list is a search_result_object, shown below:

{
  "type": "search_results",
  "report": {
    "search_results": [
      {
        "title": "Found document title",
        "text": ["Found text, allowed tags <mark> to highlight part of text. <p>, <i>, <strong> to format text"],
        "url": "[optional] link to found document",
        "page_number": ["[optional] number of "],
        "offset": ["[optional] count of tokens to offset to found text in document"],
        "search_metadata": {}
      }
    ]
  }
}

search_metadata - a free-form JSON object for displaying important data from the search engine or search process.
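The rules for body can be checked with a small validator. The sketch below covers only what the text states explicitly (the two mandatory fields, the type enum, and the search_results list); the function name validate_body and the error messages are hypothetical, and it does not check the fields inside each search_result_object.

```python
from typing import Any, List

# Known report types; the spec defines only search_results so far.
REPORT_TYPES = ("search_results",)


def validate_body(body: Any) -> List[str]:
    """Return a list of problems; an empty list means the body is acceptable."""
    if not isinstance(body, dict):
        return ["body must be a JSON object"]

    # Both top-level fields are mandatory.
    errors = [
        f"missing mandatory field: {key}"
        for key in ("type", "report")
        if key not in body
    ]
    if errors:
        return errors

    if body["type"] not in REPORT_TYPES:
        return [f"unknown type: {body['type']!r}"]

    # For type=search_results, report must hold a list under search_results.
    report = body["report"]
    results = report.get("search_results") if isinstance(report, dict) else None
    if not isinstance(results, list):
        return ["report.search_results must be a list"]

    # Each list item is expected to be a search_result_object (a JSON object).
    for i, item in enumerate(results):
        if not isinstance(item, dict):
            errors.append(f"search_results[{i}] must be an object")
    return errors
```

Returning a list of messages rather than raising on the first problem is a design choice for this sketch; a real service might rely on a schema library instead.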

Getting values

The microservice allows getting a list of reports by job_id with or without the archive flag. The API is compatible with filter_lib. Results are always limited by tenant.
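A minimal in-memory sketch of this lookup follows. It does not use filter_lib, whose interface is not described here; the function name list_reports, the dict-based rows, and the reading of "with or without the archive flag" as an optional filter are all assumptions.

```python
from typing import Any, Dict, Iterable, List, Optional


def list_reports(
    reports: Iterable[Dict[str, Any]],
    tenant: str,
    job_id: int,
    archive: Optional[bool] = None,
) -> List[Dict[str, Any]]:
    """Select the reports of one job, always restricted to one tenant.

    archive=None returns archived and non-archived reports alike;
    True or False keeps only reports whose flag matches.
    """
    return [
        r
        for r in reports
        if r["tenant"] == tenant  # the tenant restriction is never optional
        and r["job_id"] == job_id
        and (archive is None or r.get("archive", False) == archive)
    ]
```

Making tenant a required argument, rather than an optional filter, mirrors the requirement that results are always limited by tenant.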

UI

Badgerdoc will contain a new Report tab next to Tasks on the Extraction result screen. The UI must be capable of displaying mixed report types on the same page. For now, Badgerdoc only provides a search results preview. In the near future, LLM dialogs may also be displayed.

khyurri, Dec 16 '23 10:12