How to use the original dimension reduction information from Seurat

Open aina91 opened this issue 4 years ago • 4 comments

Hello, I'm glad to be using your software. I know that the SCCAF input is a pre-clustered AnnData object from the SCANPY package. I have converted an RDS object (from Seurat) to an AnnData object, hoping to optimize the clustering with SCCAF. However, the pre-processed dimensional reduction information from Seurat has been added to the 'obsm' slot, so I can't use the original dimension reduction information from Seurat. I would be grateful for any advice at your convenience.

aina91 avatar Jun 07 '20 11:06 aina91

@aina91 you can overwrite the dimension reduction slot and the clustering slot so that they hold the original information from Seurat (writing to loom files seems to work well):

import scanpy as sc
from SCCAF import SCCAF_assessment, plot_roc, SCCAF_optimize_all

# Load the loom file written from the Seurat object
adata = sc.read_loom("/full/path/to/loom/file.loom")
# Store the Seurat clusters as the starting clustering (L1, round 0)
adata.obs["L1_Round0"] = adata.obs["seurat_clusters"]
adata.raw = adata
# Store the Seurat PCA embeddings under the key SCCAF reads for PCA
adata.obsm["X_pca"] = adata.obsm["pca_cell_embeddings"]
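
The key names above (`seurat_clusters`, `pca_cell_embeddings`) depend on how the Seurat object was exported, so it may be worth checking that a key exists before copying it. A minimal sketch, assuming a hypothetical helper `copy_slot` (not part of SCCAF or scanpy) that works on any dict-like container such as `adata.obs` or `adata.obsm`; plain dicts and lists stand in for the real slots here:

```python
def copy_slot(mapping, source_key, target_key):
    """Copy mapping[source_key] to mapping[target_key].

    Hypothetical helper: raises a KeyError that lists the available keys
    when the source key is missing, instead of failing later inside SCCAF.
    """
    if source_key not in mapping:
        available = ", ".join(sorted(mapping.keys()))
        raise KeyError(f"{source_key!r} not found; available keys: {available}")
    mapping[target_key] = mapping[source_key]
    return mapping


# Stand-ins for adata.obsm and adata.obs (assumed example values).
obsm = {"pca_cell_embeddings": [[0.1, 0.2], [0.3, 0.4]]}
obs = {"seurat_clusters": ["0", "1"]}

copy_slot(obsm, "pca_cell_embeddings", "X_pca")
copy_slot(obs, "seurat_clusters", "L1_Round0")
```

With a real AnnData object the same calls would be `copy_slot(adata.obsm, "pca_cell_embeddings", "X_pca")` and `copy_slot(adata.obs, "seurat_clusters", "L1_Round0")`.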

Please let me know if you have any questions.

RegnerM2015 avatar Jun 07 '20 16:06 RegnerM2015

Thanks! You really did me a great favor.

aina91 avatar Jun 09 '20 12:06 aina91

Hello. I am also using a Seurat object. In my pre-processing, I also use Harmony to correct for different batches, and then I use those Harmony embeddings for clustering.

When I import my Seurat object as AnnData and check the obsm slot, I get this:

adata.obsm
AxisArrays with keys: X_harmony, X_pca, X_umap

I have successfully used the SCCAF_optimize_all function in this way:

adata.obs['L1_Round0'] = adata.obs['RNA_snn_res.4']
sf.SCCAF_optimize_all(ad=adata, plot=True, min_acc=0.9, prefix = 'L1', use='pca')

But I think, since I used Harmony embeddings for clustering, I should use them here too, no?

When I try that, I get an error:

adata.obs['L2_Round0'] = adata.obs['RNA_snn_res.4']
sf.SCCAF_optimize_all(ad=adata, plot=True, min_acc=0.9, prefix = 'L2', use='harmony')

I get the following:

R1norm_cutoff: 0.500000
R2norm_cutoff: 0.050000
Accuracy: 0.000000
======================

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'highly_variable'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-44-b0645bba3846> in <module>
      2 adata.obsm
      3 
----> 4 sf.SCCAF_optimize_all(ad=adata, plot=True, min_acc=0.9, prefix = 'L2', use='harmony')

/opt/conda/lib/python3.7/site-packages/SCCAF/__init__.py in SCCAF_optimize_all(ad, min_acc, R1norm_cutoff, R2norm_cutoff, R1norm_step, R2norm_step, prefix, min_i, start, start_iter, *args, **kwargs)
    652                                                      min_acc=min_acc,
    653                                                      prefix=prefix,
--> 654                                                      *args, **kwargs)
    655         print("m1: %f" % m1)
    656         print("m2: %f" % m2)

/opt/conda/lib/python3.7/site-packages/SCCAF/__init__.py in SCCAF_optimize(ad, prefix, use, use_projection, R1norm_only, R2norm_only, dist_only, dist_not, plot, basis, plot_dist, plot_cmat, mod, low_res, c_iter, n_iter, n_jobs, start_iter, sparsity, n, fraction, R1norm_cutoff, R2norm_cutoff, dist_cutoff, classifier, mplotlib_backend, min_acc)
    783         X = ad.obsm['X_pca']
    784     else:
--> 785         X = ad[:,ad.var['highly_variable']].X
    786 
    787     for i in range(start_iter, start_iter + n_iter):

/opt/conda/lib/python3.7/site-packages/pandas/core/frame.py in __getitem__(self, key)
   2798             if self.columns.nlevels > 1:
   2799                 return self._getitem_multilevel(key)
-> 2800             indexer = self.columns.get_loc(key)
   2801             if is_integer(indexer):
   2802                 indexer = [indexer]

/opt/conda/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'highly_variable'

If you could tell me what I am doing improperly, I'd appreciate it.

achamess avatar Jun 16 '20 12:06 achamess

@achamess, have you tried replacing the PCA coordinates with the Harmony coordinates?

adata.obsm["X_pca"] = adata.obsm["harmony_cell_embeddings"]  # store the Seurat Harmony embeddings in 'X_pca'

Then:

sf.SCCAF_optimize_all(ad=adata, plot=True, min_acc=0.9, prefix='L2', use='pca')
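
The traceback above shows SCCAF_optimize reading `ad.obsm['X_pca']` in one branch and `ad.var['highly_variable']` in the `else` branch, which suggests that `use='harmony'` is not handled on its own and falls through to the highly variable gene path. Overwriting `X_pca` sidesteps that. A minimal sketch of the swap, assuming a hypothetical helper `use_embedding_as_pca` and a backup key name `X_pca_original` (both made up for illustration); the source key is whatever your conversion produced, e.g. `X_harmony` in the `adata.obsm` listing above. A plain dict stands in for `adata.obsm`:

```python
def use_embedding_as_pca(obsm, source_key, backup_key="X_pca_original"):
    """Place obsm[source_key] under 'X_pca', keeping the old PCA as a backup.

    Hypothetical helper, not part of SCCAF. Works on any dict-like
    container such as adata.obsm.
    """
    if source_key not in obsm:
        available = ", ".join(sorted(obsm.keys()))
        raise KeyError(f"{source_key!r} not found; available keys: {available}")
    # Keep the original PCA once, so repeated calls do not overwrite the backup.
    if "X_pca" in obsm and backup_key not in obsm:
        obsm[backup_key] = obsm["X_pca"]
    obsm["X_pca"] = obsm[source_key]
    return obsm


# Stand-in for adata.obsm with assumed example values.
obsm = {
    "X_pca": [[1.0, 2.0], [3.0, 4.0]],
    "X_harmony": [[0.5, 0.6], [0.7, 0.8]],
}
use_embedding_as_pca(obsm, "X_harmony")
```

After the swap, SCCAF_optimize_all can be called with `use='pca'` as in the comment above, and it will pick up the Harmony coordinates from `X_pca`.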

aina91 avatar Jul 08 '20 14:07 aina91