bayeslite
Implement clone for generators
We require some way to clone a generator (what in Python would be copy.deepcopy(generator)). An API method (without BQL surface syntax) would be good enough until we decide whether this feature is actually desirable. The issue is blocking for the iap class lab on Thursday. Otherwise we will have to carry around roughly 1000 bdb files, which is quite nightmarish.
Date: Sun, 10 Jan 2016 14:15:49 -0800 From: F Saad [email protected]
We require some way to clone a generator (what in Python would be
copy.deepcopy(generator)). An API method (without BQL surface
syntax) would be good enough. The issue is blocking for the iap
class lab on Thursday. Alternatively we are going to have to carry
around roughly 1000 bdb files which is quite nightmarish.
Explain why?
We are interested in running BQL queries interleaved with analysis. In other words:
- Analyze generator G for 10 iters
- r1 <- BQL query on G
- Analyze generator G for 10 iters
- r2 <- BQL query on G
- ...
and then store all the "intermediate" generators (i.e. the generator at 10 iters, 20 iters, 30 iters, ...) in a bdb. The reason we need the "intermediate" generators is largely performance: once we have them, we can decide afterwards which values to monitor the evolution of (such as predictive probability on a test set, or simulation quality, etc.) without rerunning analysis.
It is important that r2 is querying a generator that has 10 additional analysis steps beyond the generator used by r1, as opposed to an independent generator analyzed for 20 iterations.
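To make the request concrete, here is a minimal sketch of the loop above using plain copy.deepcopy. ToyGenerator, analyze, query and analyze_with_snapshots are hypothetical stand-ins made up for illustration; none of them is the bayeslite API, and the "analysis" is just a seeded random walk.

```python
import copy
import random


class ToyGenerator:
    """Hypothetical stand-in for a generator; NOT the bayeslite API."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.iterations = 0
        self.state = 0.0

    def analyze(self, iterations):
        # Placeholder for real analysis: a seeded random walk.
        for _ in range(iterations):
            self.state += self.rng.gauss(0, 1)
            self.iterations += 1

    def query(self):
        # Placeholder for a BQL query against the current state.
        return self.state


def analyze_with_snapshots(generator, rounds, step=10):
    """Analyze in chunks of `step`, keeping a deep copy after each chunk."""
    snapshots = {}
    for _ in range(rounds):
        generator.analyze(step)
        snapshots[generator.iterations] = copy.deepcopy(generator)
    return snapshots


g = ToyGenerator(seed=0)
snapshots = analyze_with_snapshots(g, rounds=3)
r1 = snapshots[10].query()
r2 = snapshots[20].query()
```

Because each snapshot is a copy of the same evolving generator, the 20-iteration snapshot is the 10-iteration one plus 10 more steps, not an independent run.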
So the purpose is to retain historical models so that you can see how answers changed over time?
I feel as though to implement this properly we ought to just store extra bayesdb_crosscat_theta records, by adding a column to it indicating the number of iterations. Normally you would use the most recent theta for each model, but you could also choose older ones. And normally analysis would discard old ones, but we could teach it to save old ones and append new ones.
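A rough sketch of that idea with Python's sqlite3 module, assuming a simplified table: theta_history and its columns are hypothetical names chosen for illustration, not the actual bayesdb_crosscat_theta schema, and save_theta / load_theta are made-up helpers.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (generator, model, iteration count).
db.execute("""
    CREATE TABLE theta_history (
        generator_id INTEGER NOT NULL,
        modelno      INTEGER NOT NULL,
        iterations   INTEGER NOT NULL,
        theta_json   TEXT NOT NULL,
        PRIMARY KEY (generator_id, modelno, iterations)
    )
""")


def save_theta(db, generator_id, modelno, iterations, theta_json):
    """Append a theta record instead of overwriting the previous one."""
    db.execute(
        "INSERT INTO theta_history VALUES (?, ?, ?, ?)",
        (generator_id, modelno, iterations, theta_json),
    )


def load_theta(db, generator_id, modelno, iterations=None):
    """Return the most recent theta by default, or an older one on request."""
    if iterations is None:
        row = db.execute(
            "SELECT theta_json FROM theta_history"
            " WHERE generator_id = ? AND modelno = ?"
            " ORDER BY iterations DESC LIMIT 1",
            (generator_id, modelno),
        ).fetchone()
    else:
        row = db.execute(
            "SELECT theta_json FROM theta_history"
            " WHERE generator_id = ? AND modelno = ? AND iterations = ?",
            (generator_id, modelno, iterations),
        ).fetchone()
    return None if row is None else row[0]


for n in (10, 20, 30):
    save_theta(db, 1, 0, n, '{"iterations": %d}' % n)
```

With this layout, normal queries keep using the latest theta, while a historical query only needs to pass an explicit iteration count.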