[EPIC] Extend `IOPointer`s to store values in addition to keys

Open · shreyashankar opened this issue 4 years ago • 1 comment

Context: the current IOPointer abstraction stores only a string "pointer" to the data, i.e., a key such as features.csv or model.joblib. Because we store no representation of the data itself, we currently can't do anything with it. If we stored the data, we could do many things, including the following:

  • Compare current values to historical ComponentRuns' values
  • Identify whether files have been tampered with outside of ComponentRuns
  • Have more fine-grained tracing (record-level)

Storing the data in its entirety may be expensive. For now, we will store a hash of the data, which gets us one step closer to storing the data itself. Even this may be complex.
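To make the hashing idea concrete, here is a minimal sketch of how a file behind a key could be hashed and later checked for changes made outside of a ComponentRun. It is not mltrace's implementation: the helper names `hash_file` and `has_changed` are hypothetical, SHA-256 is an assumed choice of hash, and the sketch assumes the key is a path to a local file.

```python
import hashlib
import os
import tempfile


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`.

    Hypothetical helper: reads in chunks so large files are never
    loaded into memory all at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, recorded_hash):
    """Return True if the file's current hash differs from `recorded_hash`."""
    return hash_file(path) != recorded_hash


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        key = os.path.join(tmp, "features.csv")
        with open(key, "wb") as f:
            f.write(b"a,b\n1,2\n")

        # Hash recorded alongside the key when the run logs its output.
        recorded = hash_file(key)

        # Simulate an edit made outside of any tracked run.
        with open(key, "ab") as f:
            f.write(b"3,4\n")

        print("changed since last run:", has_changed(key, recorded))
```

A content hash like this only says whether the data differs from what was recorded; it does not say what changed, so record-level comparisons would still need the values themselves.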

Issues

  • [x] #215
  • [x] #216
  • [x] #217
  • [x] #218
  • [ ] #219
  • [ ] #220

In the future, we will incorporate an IOPointer "tag" model to store information about fine-grained tracing (i.e., PK values will be tags). This tagging is out of scope for the current project.

shreyashankar · Sep 04 '21 01:09

Goal: Have all this done by September 15 EOD

shreyashankar · Sep 04 '21 01:09