Miguel de Benito Delgado
So, to recap, for the meeting:
> * Agree => We leave TMCS as-is until we have always-on in-memory caching for single-node parallelism. If users can set up a cluster,...
We might want to drop MapReduce altogether (think of issues with convergence criteria)
> `MapReduce` is in general a problem due to the `StoppingCriterion`. So we should actually try to get rid of MapReduce for Shapley-like algorithms

Agreed. A central dispatcher is also...
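For concreteness, a minimal sketch of what I mean by a central dispatcher (the `utility` and `converged` names are made up for illustration, nothing in our code base): workers only evaluate utilities in batches, and the dispatcher alone aggregates results and checks the stopping criterion between batches, which is exactly the part that `MapReduce` makes awkward.

```python
import numpy as np
from joblib import Parallel, delayed


def utility(subset_size: int, rng: np.random.Generator) -> float:
    """Stand-in for a real utility evaluation (retraining on a subset)."""
    return rng.normal(loc=subset_size, scale=1.0)


def converged(values: list[float], atol: float = 0.05) -> bool:
    """Toy stopping criterion: standard error of the mean below a tolerance."""
    if len(values) < 10:
        return False
    return np.std(values) / np.sqrt(len(values)) < atol


def dispatch(n_jobs: int = 4, batch_size: int = 16, max_batches: int = 100) -> float:
    """Central dispatcher: fan out batches of evaluations, check convergence in between."""
    rng = np.random.default_rng(0)
    values: list[float] = []
    with Parallel(n_jobs=n_jobs) as parallel:
        for _ in range(max_batches):
            batch = parallel(
                delayed(utility)(10, np.random.default_rng(int(seed)))
                for seed in rng.integers(0, 2**32, size=batch_size)
            )
            values.extend(batch)
            # Only the dispatcher sees all results, so the criterion lives in one place.
            if converged(values):
                break
    return float(np.mean(values))


if __name__ == "__main__":
    print(dispatch())
```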
I think this is becoming more important: different stopping criteria, a joblib backend (#276), an increasing number of algorithms...
> I really think the `ValuationResult` class would become too big after this. Perhaps we can split the value part from the rest of it.

Makes sense. I was actually...
Hmm... I think we must avoid making a deployment necessary. The joblib backend is a great addition in this direction, and a local in-memory cache would be another. And even...
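As a sketch of the single-node direction (stub names only, not a proposal for the actual API): joblib for parallelism plus a plain in-memory cache around the utility, no external services required.

```python
from functools import lru_cache

from joblib import Parallel, delayed


@lru_cache(maxsize=None)
def cached_utility(subset: frozenset[int]) -> float:
    """In-memory cache keyed by the (hashable) index subset.

    In reality this would retrain and score a model on the subset; here it is a stub.
    """
    return float(sum(subset))  # placeholder for an expensive fit + score


def evaluate(subsets: list[frozenset[int]], n_jobs: int = 4) -> list[float]:
    """Single-node parallelism via joblib; no cluster deployment required."""
    return Parallel(n_jobs=n_jobs)(delayed(cached_utility)(s) for s in subsets)


if __name__ == "__main__":
    subsets = [frozenset({1, 2}), frozenset({1, 2}), frozenset({3})]
    print(evaluate(subsets))
```

One caveat: with joblib's default process-based backend each worker holds its own cache, so "always-on in-memory caching" would need the threading backend, shared memory, or an external store.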
See https://hanxiao.io/2019/11/07/A-Better-Practice-for-Managing-extras-require-Dependencies-in-Python/ for a flexible approach
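Not a verbatim copy of that post, but the general idea of splitting optional dependencies into extras looks roughly like this (package name, extra names and pins are only examples):

```python
# setup.py — a minimal sketch; extras and pins are illustrative only.
from setuptools import find_packages, setup

extras = {
    "ray": ["ray>=2.0"],          # cluster parallelism, opt-in
    "memcached": ["pymemcache"],  # external cache backend, opt-in
}
# Convenience extra that pulls in everything optional.
extras["all"] = sorted({dep for deps in extras.values() for dep in deps})

setup(
    name="example-package",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["numpy", "joblib"],
    extras_require=extras,
)
```

Users then opt in with e.g. `pip install "example-package[ray]"`, keeping the default install deployment-free.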
We should include rank stability as one of the metrics, as a function of the number of retrainings per subset. We could also use models for which we have sample bounds...
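One possible way to measure it, sketched with synthetic value estimates (the noise model is just a stand-in for real retraining variance): average pairwise Spearman correlation between the rankings produced by repeated runs, reported as a function of the number of retrainings.

```python
import numpy as np
from scipy.stats import spearmanr


def value_estimate(n_points: int, retrainings: int, rng: np.random.Generator) -> np.ndarray:
    """Stand-in for a valuation run: noise shrinks with more retrainings per subset."""
    true_values = np.linspace(0, 1, n_points)
    noise = rng.normal(scale=1.0 / np.sqrt(retrainings), size=n_points)
    return true_values + noise


def rank_stability(n_points: int = 50, retrainings: int = 10, n_runs: int = 5, seed: int = 0) -> float:
    """Mean pairwise Spearman correlation of rankings across repeated runs."""
    rng = np.random.default_rng(seed)
    runs = [value_estimate(n_points, retrainings, rng) for _ in range(n_runs)]
    corrs = []
    for i in range(n_runs):
        for j in range(i + 1, n_runs):
            rho, _ = spearmanr(runs[i], runs[j])
            corrs.append(rho)
    return float(np.mean(corrs))


if __name__ == "__main__":
    for r in (1, 10, 100):
        print(r, rank_stability(retrainings=r))
```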
Related (stopgap): #204
I think the path of least resistance is Scheme: super-fast development cycle, no compilation toolchain to maintain (and easier multi-platform distribution), and decent-enough speed (with improvements maybe...