deeptime
Allowing weights for VAMP dimensionality reduction
Is your feature request related to a problem? Please describe.
The `TICA` and `VAMP` decomposition classes both provide similar interfaces for `.fit_from_timeseries(data)`. However, the `TICA` class allows a `weights` argument, while the `VAMP` decomposition does not support weights, and throws an error if they're provided (see: https://github.com/deeptime-ml/deeptime/blob/11182accb1f8ce263f7c498b76c94bb657b5a998/deeptime/covariance/util/_running_moments.py#L245).
Describe the solution you'd like
Support for weights in VAMP.
I see some similarity between `moments_XXXY()` and `moments_block()`, but it seems like there was probably a reason for omitting support for weights from VAMP -- is that correct?
Hey JD,
that is an excellent question! I thought a little about this, and I don't think there is anything that speaks against having weights for VAMP per se. It is just that the weighting was ordinarily used to reweight many short off-equilibrium trajectories to equilibrium statistics, in conjunction with the `KoopmanWeightingEstimator`. May I ask what you want to achieve with the weights?
Cheers, Moritz
Hi Moritz,
Thanks for the response! Glad to hear there's no theoretical reason it's not doable. We're doing dimensionality reduction on sets of many off-equilibrium MD trajectories, using WESTPA weighted ensemble enhanced sampling. Weighted ensemble trajectories naturally carry weights with them, so we'd like to use those in the dimensionality reduction.
I've implemented weighted TICA with deeptime, but because we're often simulating unidirectional steady-state flows, I don't think the reversibility assumptions in TICA are appropriate, so we'd like to try VAMP.
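For context on where the reversibility assumption enters: TICA symmetrizes the time-lagged covariance, which is only justified at equilibrium, whereas VAMP works with the unsymmetrized matrices. A minimal numpy sketch on synthetic, clearly non-reversible data (illustrative only; deeptime does this internally):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2D trajectory with a drift in the first coordinate,
# i.e. deliberately non-reversible dynamics.
traj = np.cumsum(rng.normal(size=(1000, 2)) + [0.05, 0.0], axis=0)

lag = 10
X = traj[:-lag] - traj[:-lag].mean(axis=0)  # instantaneous frames, mean-free
Y = traj[lag:] - traj[lag:].mean(axis=0)    # time-lagged frames, mean-free

c0t = X.T @ Y / len(X)           # unsymmetrized time-lagged covariance (VAMP)
c0t_sym = 0.5 * (c0t + c0t.T)    # symmetrized estimate (reversible/TICA case)

# For non-reversible dynamics the two estimates generally differ,
# which is why forcing the symmetrized form biases the result.
```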
Our covariance computation is a bit more complicated than the usual `(X - mean(X)).T @ (X - mean(X)) / len(X)` because of its online nature, so it might take a while until I get around to implementing this. It is a bit of a hack with double computation, but you can use the `Covariance` estimator twice: once on the instantaneous/lagged data to compute weighted XX and the cross covariance XY (make sure to set `remove_data_mean=True`), and once on the lagged data only (meaning you skip the first `lagtime` frames, respecting stride if you use that), also with the data mean removed. Then you have weighted XX, weighted YY, and the cross covariance XY (the lagged weights never enter the cross covariance). Finally, combine the two `CovarianceModel` instances into one by using the mean and covariances of the XX, XY model and the `cov_00` of the second model as `cov_tt`. So in pseudocode:
```python
from deeptime.covariance import Covariance, CovarianceModel
from deeptime.decomposition import VAMP

# First pass: instantaneous/lagged pairs -> weighted C00 and C0t.
est_instantaneous = Covariance(lagtime=100, compute_c00=True, compute_c0t=True,
                               remove_data_mean=True, reversible=False,
                               bessels_correction=False)
# Second pass: lagged frames only (skip the first `lagtime` frames,
# respecting stride if used) -> weighted Ctt, stored in its c00 slot.
est_lagged = Covariance(compute_c00=True, remove_data_mean=True,
                        reversible=False, bessels_correction=False)

# `your_data_with_lagtime_100` is a placeholder for your own iterator over
# (instantaneous frames, lagged frames, and their respective weights).
for X, Y, weights_x, weights_y in your_data_with_lagtime_100:
    est_instantaneous.partial_fit((X, Y), weights=weights_x)
    est_lagged.partial_fit(Y, weights=weights_y)

model_inst = est_instantaneous.fetch_model()
model_lagged = est_lagged.fetch_model()

# Combine the two models: C00/C0t from the first pass, Ctt from the second.
model_combined = CovarianceModel(
    cov_00=model_inst.cov_00,
    cov_0t=model_inst.cov_0t,
    cov_tt=model_lagged.cov_00,
    mean_0=model_inst.mean_0,
    mean_t=model_lagged.mean_0,  # the lagged estimator stores its mean in mean_0
    bessels_correction=model_inst.bessels_correction,
    lagtime=model_inst.lagtime,
    symmetrized=False,
    data_mean_removed=True,
)

VAMP().fit(model_combined)
```
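As a sanity check on what the weighted moments should come out to, they can be compared against a direct (non-online) numpy computation. A minimal sketch under the same conventions as above (mean-free data, no Bessel correction; `weighted_moments` is a hypothetical helper, not part of deeptime):

```python
import numpy as np

def weighted_moments(X, Y, w):
    """Weighted means and covariances over per-frame weights w.
    A batch sketch of what deeptime's Covariance estimator accumulates online."""
    wsum = w.sum()
    mean_x = (w[:, None] * X).sum(axis=0) / wsum
    mean_y = (w[:, None] * Y).sum(axis=0) / wsum
    Xc, Yc = X - mean_x, Y - mean_y
    c00 = (w[:, None] * Xc).T @ Xc / wsum  # weighted instantaneous covariance
    c0t = (w[:, None] * Xc).T @ Yc / wsum  # weighted time-lagged covariance
    return mean_x, mean_y, c00, c0t

rng = np.random.default_rng(1)
traj = rng.normal(size=(1000, 3)).cumsum(axis=0)
lag = 50
X, Y = traj[:-lag], traj[lag:]
w = rng.uniform(0.1, 1.0, size=len(X))  # e.g. weighted-ensemble frame weights

mean_x, mean_y, c00, c0t = weighted_moments(X, Y, w)
```

With uniform weights this reduces to the usual `(X - mean(X)).T @ (X - mean(X)) / len(X)` estimator, which makes it easy to cross-check an online implementation.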
Thanks, I think I can work with this! That pseudocode is really helpful to see, I appreciate you taking the time to share it.
Hi @jdrusso did you have a chance to implement this?
Thanks for checking in -- unfortunately I haven't; I had to swap focus to some other things. I know @jpthompson17 was also interested, but I'm not sure if he's done anything with it since.