Facilities to use specialist math tooling (such as R language) for calculations in Guardian Policies

Open anvabr opened this issue 10 months ago • 0 comments

Problem description

Current math tooling support is limited to the facilities provided by the math.js library. While math.js is generally considered a "standard" for Node.js applications, it falls short in emissions tracking and related domains because it does not provide a convenient way to create advanced statistical models.

There are a number of considerations pertinent to the issue above:

  • Complex libraries, packages, and interpreters, such as the R language, rely on a large library of mathematical function/expression packages developed and maintained by 3rd parties. Some of these packages are not open source, and some are open source but of significant complexity. Furthermore, the lifecycle of such packages is independent of Guardian. This may make it impossible or impractical to trace and/or verify historic calculations. E.g. in the simplest scenario the policy calculation may contain something like: emissions = COMPLEX_MATRIX_STATS_FUCTION(a, b, c); It can be verified that this was executed by the policy, however there is no guarantee that COMPLEX_MATRIX_STATS_FUCTION will produce the same results, say, 2 years later. Thus, in the generic case, such calculations cannot be verified or audited.
  • The (R) interpreter would need to run outside of Guardian since it can be extremely resource intensive; the closest integration model would likely be running it as a micro-service in Guardian deployments. Should 'outside' or 3rd party (R) calculation engines, presumably accessed via APIs, be allowed in the running 'chain' of a Policy?
  • Some companies may have developed mathematical models which they consider to be their Intellectual Property, and thus intend to keep private. Would closed source implementations of e.g. COMPLEX_MATRIX_STATS_FUCTION need to be supported?
  • There are alternatives to R; for example, Python appears to be a standard for data management and AI systems. Should other interpreters be supported? Taking it further, should any calculation engine/interpreter be supported for use at run time inside and/or outside of the Guardian Policy Engine? How can the verifiability of such 'trust chains' be ensured?
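
To make the verifiability concern above more concrete, here is a minimal sketch of one possible approach: store, next to each result, the engine and package versions plus hashes of the inputs and output, so a later re-run can be compared against the record. This is an illustration under stated assumptions, not a proposed Guardian design. The names `CalculationRecord`, `run_and_record`, `verify`, and `stand_in_stats` are all hypothetical, as are the version strings and the package name.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


def canonical_hash(value) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass(frozen=True)
class CalculationRecord:
    """Hypothetical audit record stored alongside a policy calculation result."""
    function_name: str                # e.g. "COMPLEX_MATRIX_STATS_FUCTION"
    engine: str                       # e.g. "R" or "python" (free-form label)
    engine_version: str               # version of the interpreter that ran it
    package_versions: Dict[str, str]  # pinned 3rd party package versions
    inputs_hash: str                  # hash of the inputs that were used
    output_hash: str                  # hash of the result that was produced


def run_and_record(
    fn: Callable[..., float],
    function_name: str,
    engine: str,
    engine_version: str,
    package_versions: Dict[str, str],
    inputs: List[float],
) -> Tuple[float, CalculationRecord]:
    """Run the calculation and capture what is needed to re-check it later."""
    output = fn(*inputs)
    record = CalculationRecord(
        function_name=function_name,
        engine=engine,
        engine_version=engine_version,
        package_versions=dict(package_versions),
        inputs_hash=canonical_hash(inputs),
        output_hash=canonical_hash(output),
    )
    return output, record


def verify(fn: Callable[..., float], record: CalculationRecord, inputs: List[float]) -> bool:
    """Re-run the calculation and compare against the stored hashes."""
    if canonical_hash(inputs) != record.inputs_hash:
        return False  # not the inputs the record was made from
    return canonical_hash(fn(*inputs)) == record.output_hash


def stand_in_stats(a: float, b: float, c: float) -> float:
    """Stand-in for COMPLEX_MATRIX_STATS_FUCTION; the real function is unknown."""
    return (a + b + c) / 3


if __name__ == "__main__":
    inputs = [1.0, 2.0, 3.0]
    _, record = run_and_record(
        stand_in_stats,
        "COMPLEX_MATRIX_STATS_FUCTION",
        "python",                    # hypothetical engine label
        "3.11",                      # hypothetical engine version
        {"example-stats": "1.0.0"},  # hypothetical package and version
        inputs,
    )
    print(verify(stand_in_stats, record, inputs))
```

A record like this only detects drift; it does not prevent it. If a later package version changes the result, `verify` returns False, and reproducing the original value would still require access to the pinned versions. The exact hash comparison is also strict: a result that differs only in the last floating-point digit counts as a mismatch, so a real design might prefer a tolerance-based check.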

Requirements

TBD: introduce facilities to execute 'R' language?

Definition of done

TBD

Acceptance criteria

TBD

anvabr, Apr 24 '24 11:04