bayeslite
EXPLAIN IMPUTATION PLAN
From @axch: Crosscat should be able to explain why it made the imputation choice it did, phrased as answers to:
- What are the columns highly dependent on this column (i.e., in the same view)?
- What are the rows that are similar to this row with respect to those columns (i.e., in the same cluster)?
- What are the sufficient statistics of the column over the similar rows (i.e., cluster suffstats)?
This can be done exactly for a single model; explaining an integral over all models might be a little more difficult to visualize. No amputations, please.
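For the single-model case, the three questions can be sketched roughly as below. This is only an illustration, not crosscat's or bayeslite's API: the function `explain_imputation` and the structures `view_of` and `cluster_of` are hypothetical stand-ins for one model's column-to-view and row-to-cluster assignments, and the target column is assumed to be categorical, so its sufficient statistics are just value counts.

```python
from collections import Counter


def explain_imputation(data, view_of, cluster_of, row, col):
    """Explain one imputed cell under a single model (hypothetical layout).

    data:       dict mapping column name -> list of values (None = missing)
    view_of:    dict mapping column name -> view id
    cluster_of: dict mapping view id -> list of cluster ids, one per row
    """
    view = view_of[col]

    # 1. Columns dependent on the target column: same view.
    dependent_columns = sorted(
        c for c in view_of if view_of[c] == view and c != col
    )

    # 2. Rows similar to the target row w.r.t. those columns: same cluster.
    clusters = cluster_of[view]
    similar_rows = [
        r for r, k in enumerate(clusters) if k == clusters[row] and r != row
    ]

    # 3. Sufficient statistics of the column over the similar rows.
    #    Assumes a categorical column, so the suffstats are value counts.
    observed = [data[col][r] for r in similar_rows if data[col][r] is not None]
    suffstats = dict(Counter(observed))

    return {
        "dependent_columns": dependent_columns,
        "similar_rows": similar_rows,
        "suffstats": suffstats,
    }


# Toy example: row 2 is missing "color".
data = {
    "color": ["red", "red", None, "blue", "blue"],
    "shape": ["round", "round", "round", "square", "square"],
    "age": [1, 9, 4, 7, 2],
}
view_of = {"color": 0, "shape": 0, "age": 1}
cluster_of = {0: [0, 0, 0, 1, 1], 1: [0, 1, 0, 1, 0]}

explanation = explain_imputation(data, view_of, cluster_of, row=2, col="color")
print(explanation)
```

On the toy data, the explanation for the missing `color` in row 2 is that `shape` shares its view, rows 0 and 1 share its cluster, and both of those rows are `red`. Nothing here depends on crosscat specifically: any model that can report a grouping of columns and a grouping of rows could fill in the same three fields.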
This echoes #77, and I'd love to have this soon. It might be a candidate for implementation in recipes because those questions are answerable generically, not only in a crosscat-specific way.
Also, should this particular version of the bug be on crosscat's tracker? If so, perhaps the issue here is to plan an interface by which particular models can tell bayeslite about these imputation choices?
Because of #194, we might want to accelerate this, or something like it, into a polish release if possible.