Mellon: How to run Mellon on data obtained under different processing conditions

Hello, this is a very good tool! However, I have some questions about running the code.

1. I want to calculate the density changes of a specific cell-type subpopulation under different treatments. Should I merge the data from multiple treatments and then run Mellon, or should I run Mellon separately for each treatment?
2. If I merge the data from multiple treatments, do I need to integrate the data and then use the integrated PCA when running palantir.utils.run_diffusion_maps(adata, pca_key="integrated_pca", n_components=30)?

Thanks for any advice.

Jun 26 '24 06:06 minghao622

Hi @minghao622!

Thanks for your inquiry! We are currently working on establishing a differential abundance framework that will hopefully make this use case a lot easier. However, to answer your questions:

1. If you want to compare the densities, train one model per treatment separately with the .fit method. However, to get density values that you can compare, you will have to evaluate each model on the merged dataset with the .predict method.
2. Yes, palantir.utils.run_diffusion_maps should be run on the PCA of the integrated dataset. Please be aware of the potentially confounding effects of batch-effect correction, though. It might be advisable to validate the robustness of any finding with respect to the batch-effect correction method.

Please note, we haven’t established the units of the density values produced by Mellon. While differences in the log-density values correspond to the predicted log-fold change of cell-state abundance, the absolute values should not be interpreted at this time.
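As a rough sketch of the workflow described above (fit one model per condition, evaluate every model on the merged cells), something like the following might work. The helper name `fit_per_condition`, its `make_estimator` argument, the `"treatment"` column, and the `"treated"`/`"control"` labels are assumptions for illustration, not part of Mellon.

```python
import numpy as np


def fit_per_condition(X, labels, make_estimator=None):
    """Fit one density estimator per condition and evaluate each on all cells.

    X      : (n_cells, n_components) array, e.g. adata.obsm["DM_EigenVectors"]
    labels : (n_cells,) condition names, e.g. adata.obs["treatment"] (assumed column)
    """
    if make_estimator is None:
        import mellon  # imported lazily; the helper itself only needs NumPy

        make_estimator = mellon.DensityEstimator

    X = np.asarray(X)
    labels = np.asarray(labels)
    log_density = {}
    for condition in np.unique(labels):
        estimator = make_estimator()
        # Train on the cells of this condition only.
        estimator.fit(X[labels == condition])
        # Evaluate on the merged dataset so values are defined for every cell.
        log_density[str(condition)] = np.asarray(estimator.predict(X))
    return log_density


# Hypothetical usage on a merged, integrated AnnData object:
# log_density = fit_per_condition(adata.obsm["DM_EigenVectors"], adata.obs["treatment"])
# log_fold_change = log_density["treated"] - log_density["control"]
```

Only the difference between the two vectors is meant to be interpreted here; the absolute values carry no established unit.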

Stay tuned for our upcoming work on differential cell-state abundance.

Jun 26 '24 18:06 katosh

Hi! Any update on this question? I want to visualize the density results in the normal and disease conditions. How can I keep the range of mellon_log_density_clipped the same for both?

Nov 04 '24 06:11 xiaozhongshen

Thank you for your patience. There’s no update on this issue yet, but you might find the normalize option for predicted values useful. You can find details in the documentation. Please note that this feature requires the density to be learned using d_method="fractal".

Dec 09 '24 19:12 katosh

Hi, thanks for your reply. So I need to use d_method="fractal" in mellon.DensityEstimator. Do I need to change any other code too?

Dec 10 '24 09:12 xiaozhongshen

Yes, using estimator.predict(X, normalize=True) adjusts for differences based on the total number of cells, provided that estimator = mellon.DensityEstimator(d_method="fractal", ...). Keep in mind that this normalization is approximate; the integral of the density function over the entire state space is not guaranteed to equal 1.
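For the visualization question, one possible approach is to clip the normalized predictions of all conditions to a single pooled range and reuse those bounds as the color limits. This is only a sketch: the helper `clip_to_shared_range` and the dictionary keys are hypothetical, and the quantiles mirror the 0.05 and 1 used elsewhere in this thread.

```python
import numpy as np


def clip_to_shared_range(log_densities, lower_q=0.05, upper_q=1.0):
    """Clip several log-density vectors to one common [vmin, vmax] interval.

    log_densities : dict mapping a condition name to a 1-D array of values,
                    e.g. the output of estimator.predict(X, normalize=True)
    """
    # Pool the values of all conditions so the quantiles describe a shared scale.
    pooled = np.concatenate([np.asarray(v).ravel() for v in log_densities.values()])
    vmin, vmax = np.quantile(pooled, [lower_q, upper_q])
    clipped = {
        name: np.clip(np.asarray(v), vmin, vmax) for name, v in log_densities.items()
    }
    return clipped, (float(vmin), float(vmax))


# Hypothetical usage, assuming one fitted estimator per condition:
# values = {
#     "normal": est_normal.predict(X_normal, normalize=True),
#     "disease": est_disease.predict(X_disease, normalize=True),
# }
# clipped, (vmin, vmax) = clip_to_shared_range(values)
```

Passing the same `vmin` and `vmax` to each plotting call then keeps the color scales aligned across panels.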

Dec 10 '24 22:12 katosh

Thanks! Can you help me check whether the code is right?

    model = mellon.DensityEstimator(d_method='fractal')
    log_density = model.fit_predict(data1.obsm["DM_EigenVectors"])
    predictor = model.predict(data1.obsm["DM_EigenVectors"], normalize=True)
    data1.obs["mellon_log_density"] = predictor
    data1.obs["mellon_log_density_clipped"] = np.clip(
        predictor, *np.quantile(predictor, [0.05, 1])
    )

However, I found that the ranges of mellon_log_density_clipped and mellon_log_density are different from before. Can you help me check that?

Dec 11 '24 04:12 xiaozhongshen

This code enables a method that estimates the intrinsic dimensionality of the dataset. That estimate affects the unit of the resulting log-density and, therefore, its range. It is, however, the only setting for which normalization is implemented.

Dec 12 '24 08:12 katosh