Error in scNT_seq: least squares fails
When changing neuron_labeling.obs['time'] = neuron_labeling.obs.time.astype("categorical") (cell 8) to neuron_labeling.obs['time'] = neuron_labeling.obs.time.astype(float) to circumvent the problem in #4, I get a ValueError: Residuals are not finite in the initial point. while dynamo/estimation/tsc/estimation_kinetic.py:auto_fit runs least squares. Is this just an irrelevant error because one should fix #4 differently?
Full trace:
[<ipython-input-9-5d53bc49f5de>](https://ndagie4afs-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab-20230321-060141-RC01_518395136#) in dynamo_workflow(adata)
2 dyn.pp.recipe_monocle(adata)
3
----> 4 dyn.tl.dynamics(adata)
5
6 dyn.tl.reduceDimension(adata)
[~/.local/lib/python3.9/site-packages/dynamo/tools/dynamics.py](https://ndagie4afs-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab-20230321-060141-RC01_518395136#) in dynamics(adata, filter_gene_mode, use_smoothed, assumption_mRNA, assumption_protein, model, est_method, NTR_vel, group, protein_names, concat_data, log_unnormalized, one_shot_method, fraction_for_deg, re_smooth, sanity_check, del_2nd_moments, cores, tkey, **est_kwargs)
731 data_type = "smoothed" if use_smoothed else "sfs"
732
--> 733 (params, half_life, cost, logLL, param_ranges, cur_X_data, cur_X_fit_data,) = kinetic_model(
734 subset_adata,
735 tkey,
[~/.local/lib/python3.9/site-packages/dynamo/tools/dynamics.py](https://ndagie4afs-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab-20230321-060141-RC01_518395136#) in kinetic_model(subset_adata, tkey, model, est_method, experiment_type, has_splicing, splicing_labeling, has_switch, param_rngs, data_type, return_ntr, **est_kwargs)
1540 cur_X_raw = np.hstack((cur_X_raw[0, 0].A, cur_X_raw[1, 0].A))
1541
-> 1542 _, cost[i_gene] = estm.auto_fit(np.unique(time), cur_X_data)
1543 (
1544 model_1,
[~/.local/lib/python3.9/site-packages/dynamo/estimation/tsc/estimation_kinetic.py](https://ndagie4afs-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab-20230321-060141-RC01_518395136#) in auto_fit(self, time, x_data, alpha_min, beta_min, gamma_min, kin_weight, use_p0, **kwargs)
709
710 if use_p0:
--> 711 popt, cost = self.fit_lsq(time, x_data_norm, p0=p0, **kwargs)
712 else:
713 popt, cost = self.fit_lsq(time, x_data_norm, **kwargs)
[~/.local/lib/python3.9/site-packages/dynamo/estimation/tsc/estimation_kinetic.py](https://ndagie4afs-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab-20230321-060141-RC01_518395136#) in fit_lsq(self, t, x_data, p0, n_p0, bounds, sample_method, method, normalize)
209 X = []
210 for i in range(n_p0):
--> 211 ret = least_squares(
212 lambda p: self.f_lsq(p, t, x_data_norm, method, normalize),
213 p0[i],
[/usr/local/lib/python3.9/dist-packages/scipy/optimize/_lsq/least_squares.py](https://ndagie4afs-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab-20230321-060141-RC01_518395136#) in least_squares(fun, x0, jac, bounds, method, ftol, xtol, gtol, x_scale, loss, f_scale, diff_step, tr_solver, tr_options, jac_sparsity, max_nfev, verbose, args, kwargs)
835
836 if not np.all(np.isfinite(f0)):
--> 837 raise ValueError("Residuals are not finite in the initial point.")
838
839 n = x0.size
ValueError: Residuals are not finite in the initial point.
Hi @Baschdl thanks for raising up the questions again, please note that the labeling time is always 2 hours and the time is when did we treat cells with KCL. the following code should work:
neuron_labeling.obs['label_time'] = 2 # this is the labeling time
tkey = 'label_time'
# peng_gene_list = pd.read_csv('/Users/xqiu/Dropbox (Personal)/dynamo/dont_remove/0408_grp_info.txt', sep='\t')
neuron_labeling = neuron_labeling[:, neuron_labeling.var.activity_genes]
def dynamo_workflow(adata):
dyn.pp.recipe_monocle(adata, tkey='label_time', experiment_type='one-shot')
dyn.tl.dynamics(adata)
dyn.tl.reduceDimension(adata)
dyn.tl.cell_velocities(adata, calc_rnd_vel=True, transition_genes=adata.var_names)
dyn.vf.VectorField(adata, basis='umap')
neuron_labeling.obs['time'] = neuron_labeling.obs['time']/60
dynamo_workflow(neuron_labeling)
dyn.pl.streamline_plot(neuron_labeling, color='time', color_key_cmap='viridis', basis='umap', show_legend='right')
Would you mind to add more pull requests to the previous one that I closed? Thanks again and please let me know if anything else we can help
Thanks, Xiaojie
Thanks a lot @Xiaojieqiu. I was able to get it to run for labeling data with
def dynamo_workflow_labeling(adata):
dyn.pp.recipe_monocle(adata, tkey='label_time', experiment_type='one-shot')
dyn.tl.dynamics(adata)
dyn.tl.reduceDimension(adata)
dyn.tl.cell_velocities(adata, calc_rnd_vel=True, transition_genes=adata.var_names)
dyn.vf.VectorField(adata, basis='umap')
and for splicing data with the original version:
def dynamo_workflow_splicing(adata):
dyn.pp.recipe_monocle(adata)
dyn.tl.dynamics(adata)
dyn.tl.reduceDimension(adata)
dyn.tl.cell_velocities(adata, calc_rnd_vel=True)
dyn.vf.VectorField(adata, basis='umap')
This also solves #4. I can open a pull request for those changes. Is there a way to only have one dynamo_workflow() for both cases? Otherwise I would open a PR with both variants of the method.
def dynamo_workflow_labeling(adata, **kwargs ):
dyn.pp.recipe_monocle(adata, **kwargs)
dyn.tl.dynamics(adata)
dyn.tl.reduceDimension(adata)
dyn.tl.cell_velocities(adata, calc_rnd_vel=True, transition_genes=adata.var_names)
dyn.vf.VectorField(adata, basis='umap')
I think you can just add kwargs to the dynamo_workflow function to fix the issue.
Please feel free to make the pull request and close the issue #4.